Plant yield improvement by ste20-like gene expression

ABSTRACT

The present invention concerns a method for increasing plant yield by modulating expression in a plant of a nucleic acid encoding a Ste20-like polypeptide or a homologue thereof. One such method comprises introducing into a plant a Ste20-like nucleic acid or variant thereof. The invention also relates to transgenic plants having introduced therein a Ste20-like nucleic acid or variant thereof, which plants have increased yield relative to control plants. The present invention also concerns constructs useful in the methods of the invention.

RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 11/988,254 filed Jan. 23, 2008, which is a national stage application (under 35 U.S.C. 371) of PCT/EP2006/063976 filed Jul. 6, 2006, which claims benefit of European application 05106135.6 filed Jul. 6, 2005 and U.S. Provisional application 60/697,338 filed Jul. 8, 2005. The entire content of each above-mentioned application is hereby incorporated by reference in its entirety.

SUBMISSION OF SEQUENCE LISTING

The Sequence Listing associated with this application is filed in electronic format via EFS-Web and hereby incorporated by reference into the specification in its entirety. The name of the text file containing the Sequence Listing is Sequence_List_32279_00043. The size of the text file is 112 KB, and the text file was created on Jan. 16, 2012.

FIELD OF THE INVENTION

The present invention relates generally to the field of molecular biology and concerns a method for increasing plant yield relative to control plants. More specifically, the present invention concerns a method for increasing plant yield comprising modulating expression in a plant of a nucleic acid encoding a Ste20-like polypeptide or a homologue thereof. The present invention also concerns plants having modulated expression of a nucleic acid encoding a Ste20-like polypeptide or a homologue thereof, which plants have increased yield relative to control plants. The invention also provides constructs useful in the methods of the invention.

BRIEF SUMMARY OF THE INVENTION

The ever-increasing world population and the dwindling supply of arable land available for agriculture fuel research towards improving the efficiency of agriculture. Conventional means for crop and horticultural improvements utilise selective breeding techniques to identify plants having desirable characteristics. However, such selective breeding techniques have several drawbacks, namely that these techniques are typically labour intensive and result in plants that often contain heterogeneous genetic components that may not always result in the desirable trait being passed on from parent plants. Advances in molecular biology have allowed mankind to modify the germplasm of animals and plants. Genetic engineering of plants entails the isolation and manipulation of genetic material (typically in the form of DNA or RNA) and the subsequent introduction of that genetic material into a plant. Such technology has the capacity to deliver crops or plants having various improved economic, agronomic or horticultural traits. A trait of particular economic interest is yield, necessarily related to a specified crop, area and/or period of time. Yield is normally defined as the measurable produce of economic value from a crop. This may be defined in terms of quantity and/or quality. Yield is directly dependent on several factors, for example, the number and size of the organs, plant architecture (for example, the number of branches), seed production and more. Root development, nutrient uptake and stress tolerance may also be important factors in determining yield. Optimizing one of the abovementioned factors may therefore contribute to increasing crop yield.

Plant biomass is yield for forage crops like alfalfa, silage corn and hay. Many proxies for yield have been used in grain crops. Chief amongst these are estimates of plant size. Plant size can be measured in many ways depending on species and developmental stage; the measures include total plant dry weight, above-ground dry weight, above-ground fresh weight, leaf area, stem volume, plant height, rosette diameter, leaf length, root length, root mass, tiller number and leaf number. Many species maintain a conservative ratio between the size of different parts of the plant at a given developmental stage. These allometric relationships are used to extrapolate from one of these measures of size to another (e.g. Tittonell et al 2005 Agric Ecosys & Environ 105: 213). Plant size at an early developmental stage will typically correlate with plant size later in development. A larger plant with a greater leaf area can typically absorb more light and carbon dioxide than a smaller plant and therefore will likely gain a greater weight during the same period (Fasoula & Tollenaar 2005 Maydica 50:39). This is in addition to the potential continuation of the micro-environmental or genetic advantage that the plant had to achieve the larger size initially. There is a strong genetic component to plant size and growth rate (e.g. ter Steege et al 2005 Plant Physiology 139:1078), and so for a range of diverse genotypes plant size under one environmental condition is likely to correlate with size under another (Hittalmani et al 2003 Theoretical Applied Genetics 107:679). In this way a standard environment is used as a proxy for the diverse and dynamic environments encountered at different locations and times by crops in the field.

Harvest index, the ratio of seed yield to above-ground dry weight, is relatively stable under many environmental conditions and so a robust correlation between plant size and grain yield can often be obtained (e.g. Rebetzke et al 2002 Crop Science 42:739). These processes are intrinsically linked because the majority of grain biomass is dependent on current or stored photosynthetic productivity by the leaves and stem of the plant (Gardener et al 1985 Physiology of Crop Plants. Iowa State University Press, pp 68-73). Therefore, selecting for plant size, even at early stages of development, has been used as an indicator for future potential yield (e.g. Tittonell et al 2005 Agric Ecosys & Environ 105: 213). When testing for the impact of genetic differences on stress tolerance, the ability to standardize soil properties, temperature, water and nutrient availability and light intensity is an intrinsic advantage of greenhouse or plant growth chamber environments compared to the field. However, artificial limitations on yield, caused by poor pollination in the absence of wind or insects or by insufficient space for mature root or canopy growth, can restrict the use of these controlled environments for testing yield differences. Therefore, measurements of plant size in early development, under standardized conditions in a growth chamber or greenhouse, are standard practices to provide indication of potential genetic yield advantages.

Seed yield is a particularly important trait since the seeds of many plants are important for human and animal nutrition. Crops such as corn, rice, wheat, canola and soybean account for over half the total human caloric intake, whether through direct consumption of the seeds themselves or through consumption of meat products raised on processed seeds. They are also a source of sugars, oils and many kinds of metabolites used in industrial processes. Seeds contain an embryo (the source of new shoots and roots) and an endosperm (the source of nutrients for embryo growth during germination and during early growth of seedlings). The development of a seed involves many genes, and requires the transfer of metabolites from the roots, leaves and stems into the growing seed. The endosperm, in particular, assimilates the metabolic precursors of carbohydrates, oils and proteins and synthesizes them into storage macromolecules to fill out the grain. The ability to increase plant seed yield, whether through seed number, seed biomass, seed development, seed filling, or any other seed-related trait would have many applications in agriculture, and even many non-agricultural uses such as in the biotechnological production of substances such as pharmaceuticals, antibodies or vaccines.

Ste20 is a Ser/Thr kinase belonging to the group of MAP4 kinases (MAP4Ks, MAP kinase kinase kinase kinases, or MAP3K kinases), and was first isolated from yeast. MAP4Ks are kinases that activate MAP kinase cascades by directly phosphorylating MAP3Ks. A recent phylogenetic study discriminated 6 major groups of MAP4Ks, among which is the STE20/PAK group of MAP4Ks (Champion et al., Trends Plant Sci. 9, 123-129, 2004). Most of the MAP4Ks have an N-terminal catalytic domain, although plant proteins homologous to Ste20 may have a different organisation. Members of the Ste20 group of kinases are believed to act as regulators of MAP kinase cascades (Dan et al., Trends Cell Biol. 11, 220-230, 2001), and are believed to act in particular upon MAP3 Kinases of the MEKK and Raf types, downstream of G-proteins (Champion et al., 2004). Yeast Ste20 plays a role in various signalling pathways, for example Candida Ste20 was shown to be involved in pheromone signalling, invasive growth, hypertonic stress response, cell wall integrity and in binding of CDC42, required for polarized morphogenesis (Calcagno et al., Yeast 21, 557-568, 2004). In Drosophila, the Ste20 homologue Hippo is reported to be involved in cell cycle progression (Udan et al., Nat. Cell Biol. 5, 853-855, 2003). In general, the effects of STE20/PAK directed signalling appear to be nuclear events that influence gene expression on the one hand, and cytoskeletal events that impact upon cellular dynamics on the other (Bagrodia and Cerione, Trends Cell Biol. 9, 350-355, 1999). Although Ste20 and related proteins are relatively well studied in yeast, Drosophila and in mammalian cells, little or nothing is known about the plant homologues of yeast Ste20. Leprince et al. (Biochim. Biophys. Acta 1444, 1-13, 1999) have characterised a MAP4K from Brassica napus. Its expression seemed regulated by the cell cycle and transcripts were reported to be most abundant in roots, siliques and flower buds. However, no mutants or transgenic plants were described.

Surprisingly, it has now been found that modulating expression in a plant of a nucleic acid encoding a Ste20-like polypeptide or a homologue thereof gives plants having increased yield relative to control plants. Preferably, the Ste20-like polypeptide or a homologue thereof is of plant origin.

Therefore, the invention provides a method for increasing plant yield, comprising modulating expression in a plant of a nucleic acid encoding a Ste20-like polypeptide or a homologue thereof.

Advantageously, performance of the methods according to the present invention results in plants having increased yield, particularly seed yield, relative to control plants.

The choice of control plants is a routine part of an experimental setup and may include corresponding wild type plants or corresponding plants without the gene of interest. The control plant is typically of the same plant species or even of the same variety as the plant to be compared. The control plant may also be a nullizygote of the plant to be compared. Nullizygotes are individuals missing the transgene by segregation. A “control plant” as used herein refers not only to whole plants, but also to plant parts, including seeds and seed parts.

A “reference”, “reference plant”, “control”, “control plant”, “wild type” or “wild type plant” is in particular a cell, a tissue, an organ, a plant, or a part thereof, which was not produced according to the method of the invention. Accordingly, the terms “wild type”, “control” or “reference” are exchangeable and can be a cell or a part of the plant such as an organelle or tissue, or a plant, which was not modified or treated according to the herein described method according to the invention. Accordingly, the cell or a part of the plant such as an organelle or a plant used as wild type, control or reference corresponds to the cell, plant or part thereof as much as possible and is in any other property but in the result of the process of the invention as identical to the subject matter of the invention as possible. Thus, the wild type, control or reference is treated identically or as identically as possible, meaning that only conditions or properties might be different which do not influence the quality of the tested property. In other words, the wild type denotes (1) a plant, which carries the unaltered or not modulated form of a gene or allele or (2) the starting material/plant from which the plants produced by the process or method of the invention are derived.

Preferably, any comparison between the wild type plants and the plants produced by the method of the invention is carried out under analogous conditions. The term “analogous conditions” means that all conditions such as, for example, culture or growing conditions, assay conditions (such as buffer composition, temperature, substrates, pathogen strain, concentrations and the like) are kept identical between the experiments to be compared.

The “reference”, “control”, or “wild type” is preferably a subject, e.g. an organelle, a cell, a tissue, a plant, which was not modulated, modified or treated according to the herein described process of the invention and is in any other property as similar to the subject matter of the invention as possible. The reference, control or wild type is in its genome, transcriptome, proteome or metabolome as similar as possible to the subject of the present invention. Preferably, the term “reference-”, “control-” or “wild type-”-organelle, -cell, -tissue or plant, relates to an organelle, cell, tissue or plant, which is nearly genetically identical to the organelle, cell, tissue or plant, of the present invention or a part thereof, preferably 95%, more preferred are 98%, even more preferred are 99.00%, in particular 99.10%, 99.30%, 99.50%, 99.70%, 99.90%, 99.99%, 99.999% or more. Most preferably, the “reference”, “control”, or “wild type” is a subject, e.g. an organelle, a cell, a tissue, a plant, which is genetically identical to the plant, cell or organelle used according to the method of the invention except that nucleic acid molecules or the gene product encoded by them are changed, modulated or modified according to the inventive method.

In case a control, reference or wild type differing from the subject of the present invention only by not being subject of the method of the invention cannot be provided, a control, reference or wild type can be a plant in which the cause for the modulation of the activity conferring the increase of the metabolites as described under examples.

The term “yield” in general means a measurable produce of economic value, necessarily related to a specified crop, to an area, and to a period of time. Individual plant parts directly contribute to yield based on their number, size and/or weight. The actual yield, by contrast, is the yield per acre for a crop and year, which is determined by dividing total production (which includes both harvested and appraised production) by planted acres.

The terms “increase”, “improving” or “improve” are interchangeable and shall mean in the sense of the application at least a 5%, 6%, 7%, 8%, 9% or 10%, preferably at least 15% or 20%, more preferably 25%, 30%, 35% or 40% more yield and/or growth in comparison to the wild type plant as defined herein.
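The two definitions above reduce to simple arithmetic. The following Python sketch is illustrative only: the function names and the example figures are hypothetical and not part of the specification, and the 5% default is the lowest threshold named in the preceding paragraph.

```python
def actual_yield_per_acre(harvested, appraised, planted_acres):
    """Actual yield: total production (harvested + appraised) per planted acre."""
    if planted_acres <= 0:
        raise ValueError("planted_acres must be positive")
    return (harvested + appraised) / planted_acres


def percent_increase(test_value, control_value):
    """Relative increase of a test plant (or plot) over its control, in percent."""
    if control_value <= 0:
        raise ValueError("control_value must be positive")
    return 100.0 * (test_value - control_value) / control_value


def meets_minimum_increase(test_value, control_value, minimum_pct=5.0):
    """True if the increase reaches the chosen threshold (5% by default)."""
    return percent_increase(test_value, control_value) >= minimum_pct


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    print(actual_yield_per_acre(harvested=900.0, appraised=100.0, planted_acres=10.0))
    print(percent_increase(test_value=110.0, control_value=100.0))
```

The units of production are whatever the grower records (for example bushels or kilograms); the sketch only assumes that harvested and appraised production share the same unit.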

The increase referred to the activity of the polypeptide amounts in a cell, a tissue, an organelle, an organ or an organism or a part thereof preferably to at least 5%, preferably to at least 10% or to at least 15%, especially preferably to at least 20%, 25%, 30% or more, very especially preferably to at least 40%, 50% or 60%, most preferably to at least 70% or more in comparison to the control, reference or wild type.

The term “increased yield” as defined herein is taken to mean an increase in biomass (weight) of one or more parts of a plant, which may include aboveground (harvestable) parts and/or (harvestable) parts below ground.

In particular, such harvestable parts are seeds and leafy biomass, and performance of the methods of the invention results in plants having increased leafy biomass and increased seed yield relative to the seed yield of control plants.

Increased seed yield may manifest itself as one or more of the following: a) an increase in seed biomass (total seed weight) which may be on an individual seed basis and/or per plant and/or per hectare or acre; b) increased number of flowers per plant; c) increased number of (filled) seeds; d) increased seed filling rate (which is expressed as the ratio of the number of filled seeds to the total number of seeds); e) increased harvest index, which is expressed as a ratio of the yield of harvestable parts, such as seeds, divided by the total biomass; and f) increased thousand kernel weight (TKW), which is extrapolated from the number of filled seeds counted and their total weight. An increased TKW may result from an increased seed size and/or seed weight, and may also result from an increase in embryo and/or endosperm size.
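The derived seed parameters in items d) to f) can be computed from a handful of per-plant measurements. The sketch below is a minimal illustration under stated assumptions: the function and argument names are hypothetical, weights are assumed to be in grams, the seed filling rate is reported as a percentage, and the seeds are taken as the harvestable part for the harvest index.

```python
def seed_yield_metrics(filled_seeds, total_seeds, filled_seed_weight_g, total_biomass_g):
    """Derive seed filling rate, harvest index and TKW for a single plant.

    filled_seeds         -- number of filled seeds counted
    total_seeds          -- total number of seeds (filled + empty)
    filled_seed_weight_g -- total weight of the filled seeds, in grams
    total_biomass_g      -- total biomass used as denominator for harvest index
    """
    if filled_seeds <= 0 or total_seeds <= 0 or total_biomass_g <= 0:
        raise ValueError("counts and biomass must be positive")
    if filled_seeds > total_seeds:
        raise ValueError("filled_seeds cannot exceed total_seeds")

    # d) seed filling rate: filled seeds over total seeds, here as a percentage
    fill_rate_pct = 100.0 * filled_seeds / total_seeds
    # e) harvest index: yield of harvestable parts (seeds) over total biomass
    harvest_index = filled_seed_weight_g / total_biomass_g
    # f) TKW: extrapolated from the filled seeds counted and their total weight
    tkw_g = 1000.0 * filled_seed_weight_g / filled_seeds

    return {
        "fill_rate_pct": fill_rate_pct,
        "harvest_index": harvest_index,
        "tkw_g": tkw_g,
    }


if __name__ == "__main__":
    # Hypothetical measurements for one plant.
    print(seed_yield_metrics(80, 100, 2.0, 10.0))
```

Comparing these values between transgenic and control plants grown under analogous conditions gives the relative increases discussed in this section.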

The term “expression” or “gene expression” means the appearance of a phenotypic trait as a consequence of the transcription of a specific gene or specific genes. The term “expression” or “gene expression” in particular means the transcription of a gene or genes into structural RNA (rRNA, tRNA) or mRNA with subsequent translation of the latter into a protein. The process includes transcription of DNA, processing of the resulting mRNA product and its translation into an active protein.

The term “modulation” means, in relation to expression or gene expression, a process in which the expression level is changed by said gene expression in comparison to the control plant; preferably the expression level is increased. The original, unmodulated expression may be of any kind of expression of a structural RNA (rRNA, tRNA) or mRNA with subsequent translation. The term “modulating the activity” shall mean any change of the expression of the inventive nucleic acid sequences or encoded proteins, which leads to increased yield and/or increased growth of the plants.

An increase in seed yield may also be manifested as an increase in seed size and/or seed volume, which may also influence the composition of seeds (including oil, protein and carbohydrate total content and composition). Furthermore, an increase in seed yield may also manifest itself as an increase in seed area and/or seed length and/or seed width and/or seed perimeter. Increased yield may also result in modified architecture, or may occur because of modified architecture.

Taking corn as an example, a yield increase may be manifested as one or more of the following: increase in the number of plants per hectare or acre, an increase in the number of ears per plant, an increase in the number of rows, number of kernels per row, kernel weight, thousand kernel weight, ear length/diameter, increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), among others. Taking rice as an example, a yield increase may manifest itself as an increase in one or more of the following: number of plants per hectare or acre, number of panicles per plant, number of spikelets per panicle, number of flowers (florets) per panicle (which is expressed as a ratio of the number of filled seeds over the number of primary panicles), increase in the seed filling rate (which is the number of filled seeds divided by the total number of seeds and multiplied by 100), increase in thousand kernel weight, among others. An increase in yield may also result in modified architecture, or may occur as a result of modified architecture.

According to a preferred feature, performance of the methods of the invention results in plants having increased yield, particularly seed yield. Therefore, according to the present invention, there is provided a method for increasing plant yield, which method comprises modulating expression in a plant of a nucleic acid encoding a Ste20-like polypeptide or a homologue thereof.

Since the transgenic plants according to the present invention have increased yield, it is likely that these plants exhibit an increased growth rate (during at least part of their life cycle), relative to the growth rate of corresponding wild type plants at a corresponding stage in their life cycle. The increased growth rate may be specific to one or more parts of a plant (including seeds), or may be throughout substantially the whole plant. Plants having an increased growth rate may have a shorter life cycle. The life cycle of a plant may be taken to mean the time needed to grow from a dry mature seed up to the stage where the plant has produced dry mature seeds, similar to the starting material. This life cycle may be influenced by factors such as early vigour, growth rate, flowering time and speed of seed maturation. An increase in growth rate may take place at one or more stages in the life cycle of a plant or during substantially the whole plant life cycle. Increased growth rate during the early stages in the life cycle of a plant may reflect enhanced vigour. The increase in growth rate may alter the harvest cycle of a plant allowing plants to be sown later and/or harvested sooner than would otherwise be possible (a similar effect may be obtained with earlier flowering time). If the growth rate is sufficiently increased, it may allow for the further sowing of seeds of the same plant species (for example sowing and harvesting of rice plants followed by sowing and harvesting of further rice plants all within one conventional growing period). Similarly, if the growth rate is sufficiently increased, it may allow for the further sowing of seeds of different plant species (for example the sowing and harvesting of rice plants followed by, for example, the sowing and optional harvesting of soy bean, potato or any other plant). Harvesting additional times from the same rootstock in the case of some crop plants may also be possible. Altering the harvest cycle of a plant may lead to an increase in annual biomass production per acre (due to an increase in the number of times (say in a year) that any particular plant may be grown and harvested). An increase in growth rate may also allow for the cultivation of transgenic plants in a wider geographical area than their wild-type counterparts, since the territorial limitations for growing a crop are often determined by adverse environmental conditions either at the time of planting (early season) or at the time of harvesting (late season). Such adverse conditions may be avoided if the harvest cycle is shortened. The growth rate may be determined by deriving various parameters from growth curves; such parameters may be T-Mid (the time taken for plants to reach 50% of their maximal size) and T-90 (the time taken for plants to reach 90% of their maximal size), amongst others.
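As an illustration of how T-Mid and T-90 might be derived from a measured growth curve, the Python sketch below interpolates linearly between sampled time points. This is only one possible approach and is not taken from the specification: the function name, the use of the largest observed size as the "maximal size", and the linear interpolation are all assumptions, and the data are invented.

```python
def time_to_fraction(times, sizes, fraction):
    """Time at which plant size first reaches `fraction` of its maximal size.

    times    -- sampling times (e.g. days after sowing), strictly increasing
    sizes    -- measured plant sizes at those times (e.g. leaf area)
    fraction -- 0.5 for T-Mid, 0.9 for T-90
    The maximal size is assumed to be the largest observed value, and the
    crossing time is estimated by linear interpolation between samples.
    """
    if len(times) != len(sizes) or not times:
        raise ValueError("times and sizes must be non-empty and equally long")
    target = fraction * max(sizes)
    for i, size in enumerate(sizes):
        if size >= target:
            if i == 0:
                return float(times[0])
            t0, t1 = times[i - 1], times[i]
            s0, s1 = sizes[i - 1], sizes[i]
            # s0 < target <= s1 here, so the denominator is non-zero
            return t0 + (target - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("target size is never reached")


if __name__ == "__main__":
    # Hypothetical growth curve: days after sowing vs. plant size.
    days = [0, 10, 20, 30, 40]
    size = [0, 20, 60, 95, 100]
    print("T-Mid:", time_to_fraction(days, size, 0.5))
    print("T-90:", time_to_fraction(days, size, 0.9))
```

A lower T-Mid or T-90 for a transgenic line than for its control, measured under analogous conditions, would indicate a faster growth rate in the sense used above.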

According to a preferred feature of the present invention, performance of the methods of the invention gives plants having an increased growth rate or increased yield in comparison to control plants. Therefore, according to the present invention, there is provided a method for increasing yield and/or the growth rate of plants, which method comprises modulating, preferably increasing, expression in a plant of a nucleic acid encoding a Ste20-like protein.

An increase in yield and/or growth rate occurs whether the plant is under non-stress conditions or whether the plant is exposed to various stresses compared to control plants. Plants typically respond to exposure to stress by growing more slowly. In conditions of severe stress, the plant may even stop growing altogether. Mild stress on the other hand is defined herein as being any stress to which a plant is exposed which does not result in the plant ceasing to grow altogether without the capacity to resume growth. Mild stress in the sense of the invention leads to a reduction in the growth of the stressed plants of less than 40%, 35% or 30%, preferably less than 25%, 20% or 15%, more preferably less than 14%, 13%, 12%, 11% or 10% or less in comparison to the control plant under non-stress conditions. Due to advances in agricultural practices (irrigation, fertilization, pesticide treatments) severe stresses are not often encountered in cultivated crop plants. As a consequence, the compromised growth induced by mild stress is often an undesirable feature for agriculture. Mild stresses are the typical stresses to which a plant may be exposed. These stresses may be the everyday biotic and/or abiotic (environmental) stresses to which a plant is exposed. Typical abiotic or environmental stresses include temperature stresses caused by atypical hot or cold/freezing temperatures; salt stress; water stress (drought or excess water). Chemicals may also cause abiotic stresses. Biotic stresses are typically those stresses caused by pathogens, such as bacteria, viruses, fungi and insects. In another preferred embodiment of the invention an increase in yield and/or growth rate occurs according to the method of the invention under non-stress or mild abiotic or biotic stress conditions, preferably under non-stress or mild abiotic stress conditions.

The abovementioned growth characteristics may advantageously be modified in any plant.

The term “plant” as used herein encompasses whole plants, ancestors and progeny of the plants, plant cells and plant parts, including seeds, shoots, stems, leaves, roots (including tubers), flowers, and tissues and organs, wherein each of the aforementioned comprise the gene/nucleic acid of interest. The term “plant” also encompasses suspension cultures, callus tissue, embryos, meristematic regions, gametophytes, sporophytes, pollen and microspores, again wherein each of the aforementioned comprise the gene/nucleic acid of interest.

Plants that are particularly useful in the methods of the invention include all plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and dicotyledonous plants including fodder or forage legumes, ornamental plants, food crops, trees or shrubs selected from the list comprising Acer spp., Actinidia spp., Abelmoschus spp., Agropyron spp., Allium spp., Amaranthus spp., Ananas comosus, Annona spp., Apium graveolens, Arabidopsis thaliana, Arachis spp., Artocarpus spp., Asparagus officinalis, Avena sativa, Averrhoa carambola, Benincasa hispida, Bertholletia excelsa, Beta vulgaris, Brassica spp., Cadaba farinosa, Camellia sinensis, Canna indica, Capsicum spp., Carex elata, Carica papaya, Carissa macrocarpa, Carya spp., Carthamus tinctorius, Castanea spp., Cichorium endivia, Cinnamomum spp., Citrullus lanatus, Citrus spp., Cocos spp., Coffea spp., Colocasia esculenta, Cola spp., Coriandrum sativum, Corylus spp., Crataegus spp., Crocus sativus, Cucurbita spp., Cucumis spp., Cynara spp., Daucus carota, Desmodium spp., Dimocarpus longan, Dioscorea spp., Diospyros spp., Echinochloa spp., Eleusine coracana, Eriobotrya japonica, Eugenia uniflora, Fagopyrum spp., Fagus spp., Ficus carica, Fortunella spp., Fragaria spp., Ginkgo biloba, Glycine spp., Gossypium hirsutum, Helianthus spp., Hemerocallis fulva, Hibiscus spp., Hordeum spp., Ipomoea batatas, Juglans spp., Lactuca sativa, Lathyrus spp., Lens culinaris, Linum usitatissimum, Litchi chinensis, Lotus spp., Luffa acutangula, Lupinus spp., Luzula sylvatica, Macrotyloma spp., Malus spp., Malpighia emarginata, Mammea americana, Mangifera indica, Manihot spp., Manilkara zapota, Medicago sativa, Melilotus spp., Mentha spp., Momordica spp., Morus nigra, Musa spp., Nicotiana spp., Olea spp., Opuntia spp., Ornithopus spp., Oryza spp., Panicum miliaceum, Passiflora edulis, Pastinaca sativa, Persea spp., Petroselinum crispum, Phaseolus spp., Phoenix spp., Physalis spp., Pinus spp., Pistacia vera, Pisum spp., Poa spp., Populus spp., Prosopis spp., Prunus spp., Psidium spp., Punica granatum, Pyrus communis, Quercus spp., Raphanus sativus, Rheum rhabarbarum, Ribes spp., Rubus spp., Saccharum spp., Sambucus spp., Secale cereale, Sesamum spp., Sinapis sp., Solanum spp., Sorghum bicolor, Spinacia spp., Syzygium spp., Tamarindus indica, Theobroma cacao, Trifolium spp., Triticosecale rimpaui, Triticum spp., Tropaeolum minus, Tropaeolum majus, Vaccinium spp., Vicia spp., Vigna spp., Viola odorata, Vitis spp., Zea mays, Zizania palustris, Ziziphus spp., amongst others. According to a preferred embodiment of the present invention, the plant is a crop plant such as soybean, sunflower, canola, alfalfa, rapeseed, cotton, tomato, potato or tobacco. Further preferably, the plant is a monocotyledonous plant, such as sugar cane. More preferably the plant is a cereal, such as rice, maize, wheat, barley, millet, rye, sorghum or oats.

Other advantageous plants are selected from the group consisting of Asteraceae such as the genera Helianthus, Tagetes e.g. the species Helianthus annuus [sunflower], Tagetes lucida, Tagetes erecta or Tagetes tenuifolia [Marigold]; Brassicaceae such as the genera Brassica, Arabidopsis e.g. the species Brassica napus, Brassica rapa ssp. [canola, oilseed rape, turnip rape] or Arabidopsis thaliana; Fabaceae such as the genera Glycine e.g. the species Glycine max, Soja hispida or Soja max [soybean]; Linaceae such as the genera Linum e.g. the species Linum usitatissimum [flax, linseed]; Poaceae such as the genera Hordeum, Secale, Avena, Sorghum, Oryza, Zea, Triticum e.g. the species Hordeum vulgare [barley], Secale cereale [rye], Avena sativa, Avena fatua, Avena byzantina, Avena fatua var. sativa, Avena hybrida [oat], Sorghum bicolor [Sorghum, millet], Oryza sativa, Oryza latifolia [rice], Zea mays [corn, maize], Triticum aestivum, Triticum durum, Triticum turgidum, Triticum hybernum, Triticum macha, Triticum sativum or Triticum vulgare [wheat, bread wheat, common wheat]; Solanaceae such as the genera Solanum, Lycopersicon e.g. the species Solanum tuberosum [potato], Lycopersicon esculentum, Lycopersicon lycopersicum, Lycopersicon pyriforme, Solanum integrifolium or Solanum lycopersicum [tomato].

The term “Ste20-like polypeptide or homologue thereof” as defined herein refers to a MAP4K polypeptide, preferably of plant origin, comprising an N-terminal Ser/Thr kinase domain (matching the SMART database entry SM00220, InterPro accession IPR002290). The kinase domain in SEQ ID NO: 2 starts at Y15 and ends at F293, and comprises the ProSite Ser/Thr protein kinase pattern PS00108:

[LIVMFYC]-x-[HY]-x-D-[LIVMFY]-K-x(2)-N-[LIVMFYCT](3), wherein the first x is missing. Preferably, the Ste20-like polypeptide or homologue thereof comprises the Ste20 signature sequence:
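For readers who wish to screen a sequence against the PS00108 pattern, a ProSite pattern can be translated into a regular expression. The Python sketch below is a minimal illustration, not an official ProSite tool: the converter handles only the subset of the syntax used here (bracketed residue sets, `x` wildcards, braces for excluded residues and parenthesised repeat counts), the test sequences are invented, and the relaxed pattern with an optional first `x` is merely one possible reading of the remark that the first x is missing.

```python
import re

# ProSite Ser/Thr protein kinase pattern quoted in the text.
PS00108 = "[LIVMFYC]-x-[HY]-x-D-[LIVMFY]-K-x(2)-N-[LIVMFYCT](3)"
# Assumption: variant in which the first wildcard position may be absent.
PS00108_RELAXED = "[LIVMFYC]-x(0,1)-[HY]-x-D-[LIVMFY]-K-x(2)-N-[LIVMFYCT](3)"


def prosite_to_regex(pattern):
    """Translate a simple ProSite pattern into Python regex syntax."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("x", ".")                        # any residue
        element = element.replace("{", "[^").replace("}", "]")     # excluded residues
        element = element.replace("(", "{").replace(")", "}")      # repeat counts
        parts.append(element)
    return "".join(parts)


def has_pattern(sequence, pattern):
    """True if the protein sequence contains a match for the ProSite pattern."""
    return re.search(prosite_to_regex(pattern), sequence.upper()) is not None


if __name__ == "__main__":
    print(prosite_to_regex(PS00108))
    # Invented peptide fragments, not taken from SEQ ID NO: 2.
    print(has_pattern("IIHRDLKPENILL", PS00108))
    print(has_pattern("IHRDLKPENILL", PS00108_RELAXED))
```

Dedicated tools such as ScanProsite remain the appropriate choice for a full implementation of the pattern syntax.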

(SEQ ID NO: 6; Dan et al., 2001) G(T/N)P(Y/C/R)(W/R)MAPE(V/K), more preferably it comprises the sequence motif:

(SEQ ID NO: 7) (S/H/N)(I/L)(V/I/L/M)(S/K)(S/H/T/A/I/V)(S/G/V/A)(F/Y)(P/Q)(S/N/D/E)G, most preferably the Ste20-like polypeptide or homologue thereof comprises at least one of the following sequence motifs:

SEQ ID NO: 8 (V/I)HSH(T/N/V)GY(G/S)(F/I),
SEQ ID NO: 9 RPPLSHLPP(L/S)KS,
SEQ ID NO: 10 RRISGWNF.

At the C-terminal end of the protein, a coiled coil motif may be present (K450 to T477 in SEQ ID NO: 2).
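The signature sequence and motifs of SEQ ID NO: 6 to 10 are written with alternatives in parentheses, which maps directly onto regular-expression character classes. The following Python sketch shows one way a candidate sequence could be scanned for them; the helper names and the test peptide are hypothetical, and exact matching is an assumption (a real search might tolerate mismatches).

```python
import re

# Motifs as written in the text; "(T/N)" means T or N at that position.
MOTIFS = {
    "SEQ ID NO: 6": "G(T/N)P(Y/C/R)(W/R)MAPE(V/K)",
    "SEQ ID NO: 7": "(S/H/N)(I/L)(V/I/L/M)(S/K)(S/H/T/A/I/V)(S/G/V/A)(F/Y)(P/Q)(S/N/D/E)G",
    "SEQ ID NO: 8": "(V/I)HSH(T/N/V)GY(G/S)(F/I)",
    "SEQ ID NO: 9": "RPPLSHLPP(L/S)KS",
    "SEQ ID NO: 10": "RRISGWNF",
}


def motif_to_regex(motif):
    """Convert '(A/B/C)' alternatives into regex character classes '[ABC]'."""
    return re.sub(
        r"\(([A-Z](?:/[A-Z])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )


def find_motifs(sequence):
    """Return (motif name, 0-based start position) for every exact motif match."""
    hits = []
    for name, motif in MOTIFS.items():
        for match in re.finditer(motif_to_regex(motif), sequence.upper()):
            hits.append((name, match.start()))
    return hits


if __name__ == "__main__":
    # Invented peptide containing one variant of the Ste20 signature sequence.
    print(find_motifs("AAGTPYWMAPEVAA"))
```

Positions are reported 0-based, so one must be added to compare them with residue numbering such as Y15 or K450 in SEQ ID NO: 2.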

The term “domain” refers to a set of amino acids conserved at specific positions along an alignment of sequences of evolutionarily related proteins. While amino acids at other positions can vary between homologues, amino acids that are highly conserved at specific positions indicate amino acids that are essential in the structure, the stability, or the activity of a protein. Identified by their high degree of conservation in aligned sequences of a family of protein homologues, they can be used as identifiers to determine if any polypeptide in question belongs to a previously identified polypeptide family (in this case, the family of Ste20-like proteins). The term “motif” refers to a short conserved region in a protein sequence. Motifs are frequently highly conserved parts of domains, but may also include only part of the domain, or be located outside of a conserved domain (if all of the amino acids of the motif fall outside of a defined domain).

Specialist databases exist for the identification of domains. The kinase domain in a Ste20-like protein may be identified using, for example, SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2002) Nucleic Acids Res 30, 242-244), InterPro (Mulder et al., (2003) Nucl. Acids. Res. 31, 315-318), Prosite (Bucher and Bairoch (1994), A generalized profile syntax for biomolecular sequences motifs and its function in automatic sequence interpretation. (In) ISMB-94; Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology. Altman R., Brutlag D., Karp P., Lathrop R., Searls D., Eds., pp 53-61, AAAI Press, Menlo Park; Hulo et al., Nucl. Acids. Res. 32:D134-D137, (2004)) or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)). A set of tools for in silico analysis of protein sequences is available on the ExPASy proteomics server, hosted by the Swiss Institute of Bioinformatics (Gasteiger et al., ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res. 31:3784-3788 (2003)).

By aligning other protein sequences with SEQ ID NO: 2, the corresponding Ste20 signature sequence, the kinase domain and other sequence motifs detailed above may easily be identified. In this way, Ste20-like polypeptides or homologues thereof (encompassing orthologues and paralogues) may readily be identified, using routine techniques well known in the art, such as by sequence alignment. Methods for the alignment of sequences for comparison are well known in the art; such methods include GAP, BESTFIT, BLAST, FASTA and TFASTA. GAP uses the algorithm of Needleman and Wunsch ((1970) J Mol Biol 48: 443-453) to find the alignment of two complete sequences that maximizes the number of matches and minimizes the number of gaps. The BLAST algorithm (Altschul et al. (1990) J Mol Biol 215: 403-10) calculates percent sequence identity and performs a statistical analysis of the similarity between the two sequences. The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information. Homologues may readily be identified using, for example, the ClustalW multiple sequence alignment algorithm (version 1.83), with the default pairwise alignment parameters, and a scoring method in percentage. Global percentages of similarity and identity may also be determined using one of the methods available in the MatGAT software package (Campanella et al., BMC Bioinformatics. 2003 Jul. 10; 4:29. MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences.). Minor manual editing may be performed to optimise alignment between conserved motifs, as would be apparent to a person skilled in the art. Furthermore, instead of using full length sequences for the identification of homologues, specific domains (such as the kinase domain) may be used as well. The sequence identity values, which are indicated above as a percentage, were determined over the entire conserved domain or nucleic acid or amino acid sequence using the programs mentioned above using the default parameters.
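As an illustration of how a percentage identity may be derived from a pairwise alignment, the sketch below counts identical positions over the aligned columns. It assumes that the two sequences have already been aligned (for example with one of the programs named above) and that gaps are written as "-". Dividing by the number of aligned columns is only one of several conventions in use and is an assumption here; the named programs may apply a different denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the aligned columns.

    Columns in which both sequences carry a gap are ignored; a column in
    which only one sequence has a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


# Hypothetical aligned fragments, for illustration only.
print(round(percent_identity("MAPE-V", "MAPEKV"), 1))
```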

Examples of Ste20-like polypeptides or homologues thereof include the Arabidopsis sequences SEQ ID NO: 12 (corresponding to At5g14720, encoded by GenBank accession number AAL38867), SEQ ID NO: 14 (At1g70430, GenBank NP_177200), SEQ ID NO: 16 (At1g23700, GenBank NP_173782), SEQ ID NO: 18 (At1g79640, GenBank NP_178082), SEQ ID NO: 20 (At4g10730, GenBank NP_192811), SEQ ID NO: 22 (At4g24100, GenBank NP_194141); the rice sequences SEQ ID NO: 24 (GenBank XP_469286, the rice orthologue of SEQ ID NO: 2), SEQ ID NO: 26 (GenBank BAD37346), SEQ ID NO: 28 (GenBank XP_468215), SEQ ID NO: 30 (GenBank AAL54869), SEQ ID NO: 32 (GenBank NP_912431) and the Medicago sequence SEQ ID NO: 34 (orthologue of SEQ ID NO: 2).

It is to be understood that sequences falling under the definition of “Ste20-like polypeptide or homologue thereof” are not to be limited to the sequences represented by SEQ ID NO: 2, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32 or SEQ ID NO: 34, but that any polypeptide comprising an N-terminal kinase domain as defined above, the Ste20 signature sequence and preferably also one or more of the sequence motifs detailed in SEQ ID NO: 7, 8, 9 and 10, may be suitable for use in the methods of the invention. In a preferred embodiment, the homologue used in the methods of the present invention is an orthologue of SEQ ID NO: 2.

An assay may be carried out to determine Ste20-like activity. For example, to determine the kinase activity, several assays are available and well known in the art (for example Current Protocols in Molecular Biology, Volumes 1 and 2, Ausubel et al. (1994), Current Protocols). The Ste20-like protein is a MAP4K kinase involved in signal transduction. For several organisms, the substrate of Ste20 was identified as Ste11p (Drogen et al., Current Biology 10, 630-639, 2000). Besides in vitro phosphorylation of the Ste11p protein, Ste20 was also shown to phosphorylate histone H2B (Ahn et al., Cell 120, 25-36, 2005). Buffer composition, ionic strength, and pH may be optimized starting from a standard kinase assay mixture. A standard 5× Kinase Buffer generally contains 5 mg/ml BSA (Bovine Serum Albumin, preventing kinase adsorption to the assay tube), 150 mM Tris-Cl (pH 7.5) and 100 mM MgCl₂. Divalent cations are required for most tyrosine kinases, although some tyrosine kinases (for example, insulin-, IGF-1-, and PDGF receptor kinases) require MnCl₂ instead of MgCl₂ (or in addition to MgCl₂). The optimal concentrations of divalent cations must be determined empirically for each protein kinase. A commonly used donor for the phosphoryl group is radio-labelled [gamma-³²P]ATP (normally at 0.2 mM final concentration). The amount of ³²P incorporated in the peptides may be determined by measuring activity on the nitrocellulose dry pads in a scintillation counter.
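By way of a worked example, the final concentrations obtained when the 5× Kinase Buffer described above is diluted to 1× in the assay follow from simple division. The sketch below is illustrative only; the 25 µl reaction volume is an assumed figure and is not specified in this description.

```python
# Composition of the 5x Kinase Buffer as given in the description.
STOCK_5X = {
    "BSA (mg/ml)": 5.0,
    "Tris-Cl pH 7.5 (mM)": 150.0,
    "MgCl2 (mM)": 100.0,
}


def final_concentrations(stock, fold=5.0):
    """Concentrations after diluting a 'fold'-times concentrated stock to 1x."""
    return {name: value / fold for name, value in stock.items()}


def stock_volume(reaction_volume_ul, fold=5.0):
    """Volume of concentrated buffer needed for a reaction of the given volume."""
    return reaction_volume_ul / fold


print(final_concentrations(STOCK_5X))
print(stock_volume(25.0))  # 25 ul is an assumed reaction volume
```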

Furthermore, expression of the Ste20-like protein or of a homologue thereof in plants, and in particular in rice, has the effect of increasing yield of the transgenic plant when compared to control plants, wherein increased yield comprises at least one of: total weight of seeds, number of filled seeds and harvest index.

“Homologues” of a protein encompass peptides, oligopeptides, polypeptides, proteins and enzymes having amino acid substitutions, deletions and/or insertions relative to the unmodified protein in question and having similar biological and functional activity as the unmodified protein from which they are derived.

Encompassed by the term “homologues” are orthologous sequences and paralogous sequences, two special forms of homology which encompass evolutionary concepts used to describe ancestral relationships of genes. Paralogues are genes within the same species that have originated through duplication of an ancestral gene and orthologues are genes from different organisms that have originated through speciation.

Orthologues and paralogues may easily be found by performing a so-called reciprocal blast search. This may be done by a first BLAST involving BLASTing a query sequence (for example, SEQ ID NO: 1 or SEQ ID NO: 2) against any sequence database, such as the publicly available NCBI database. BLASTN or TBLASTX (using standard default values) may be used when starting from a nucleotide sequence and BLASTP or TBLASTN (using standard default values) may be used when starting from a protein sequence. The BLAST results may optionally be filtered. The full-length sequences of either the filtered results or non-filtered results are then BLASTed back (second BLAST) against sequences from the organism from which the query sequence is derived (where the query sequence is SEQ ID NO: 1 or SEQ ID NO: 2, the second BLAST would therefore be against Arabidopsis sequences). The results of the first and second BLASTs are then compared. A paralogue is identified if a high-ranking hit from the second BLAST is from the same species as from which the query sequence is derived; an orthologue is identified if a high-ranking hit is not from the same species as from which the query sequence is derived. Preferred orthologues are orthologues of SEQ ID NO: 1 or SEQ ID NO: 2. High-ranking hits are those having a low E-value. The lower the E-value, the more significant the score (or in other words the lower the chance that the hit was found by chance). Computation of the E-value is well known in the art. In addition to E-values, comparisons are also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. Preferably the score is greater than 50, more preferably greater than 100; and preferably the E-value is less than e-5, more preferably less than e-6. In the case of large families, ClustalW may be used, followed by the generation of a neighbour joining tree, to help visualize clustering of related genes and to identify orthologues and paralogues. Examples of sequences orthologous to SEQ ID NO: 2 include SEQ ID NO: 24 and SEQ ID NO: 34. Examples of paralogues of SEQ ID NO: 2 include SEQ ID NO: 12 (At5g14720) and SEQ ID NO: 20 (At4g10730).
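The comparison of the first and second BLAST results may be sketched as follows. The sketch does not run BLAST; it operates on hits that are assumed to have been parsed already into simple records, and the record layout, function names and identifiers are hypothetical. It applies the preferred thresholds given above (score greater than 50, E-value less than e-5) and uses the common reciprocal-best-hit convention, which is one possible reading of the procedure described above rather than a definitive implementation of it.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    subject_id: str   # identifier of the sequence that was hit
    species: str      # species of that sequence
    score: float      # BLAST score
    evalue: float     # BLAST E-value


def passes_thresholds(hit, min_score=50.0, max_evalue=1e-5):
    """Preferred cut-offs from the description: score > 50, E-value < e-5."""
    return hit.score > min_score and hit.evalue < max_evalue


def classify(first_hit, query_id, query_species, back_hits):
    """Classify one hit from the first BLAST.

    back_hits are the second-BLAST hits obtained when the hit sequence is
    searched against the query organism, best hit first.
    """
    if first_hit.subject_id == query_id:
        return "query"
    if not passes_thresholds(first_hit):
        return "not significant"
    if first_hit.species == query_species:
        return "paralogue candidate"
    significant = [h for h in back_hits if passes_thresholds(h)]
    if significant and significant[0].subject_id == query_id:
        return "orthologue candidate"
    return "homologue, not a reciprocal best hit"


# Hypothetical records, for illustration only.
rice_hit = Hit("rice_seq", "Oryza sativa", 410.0, 1e-120)
back = [Hit("query_seq", "Arabidopsis thaliana", 405.0, 1e-118)]
print(classify(rice_hit, "query_seq", "Arabidopsis thaliana", back))
```

Candidates identified in this way may then be confirmed by inspecting a neighbour joining tree, as noted above for large families.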

Preferably, the kinase domains of Ste20-like proteins useful in the methods of the present invention have, in increasing order of preference, at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% sequence identity to the kinase domain of the Ste20 protein of SEQ ID NO: 2. An example detailing the identification of homologues is given in Example 1. The matrix shown in Example 1 (Table 4) shows similarities and identities (in bold) over the full-length of the protein. In case only specific domains are compared, the identity or similarity may be higher among the different proteins (Table 5: comparison of the kinase domains only).

A Ste20-like polypeptide or homologue thereof is encoded by a Ste20-like nucleic acid/gene. Therefore the term “Ste20-like nucleic acid/gene” as defined herein is any nucleic acid/gene encoding a Ste20-like polypeptide or a homologue thereof as defined above.

Examples of Ste20-like nucleic acids include but are not limited to those represented by any one of SEQ ID NO: 3, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31 or SEQ ID NO: 33.

Ste20-like nucleic acids/genes and variants thereof may be suitable in practising the methods of the invention. Variant Ste20-like nucleic acid/genes include portions of a Ste20-like nucleic acid/gene, splice variants, allelic variants and/or nucleic acids capable of hybridising with a Ste20-like nucleic acid/gene.

The term portion as defined herein refers to a piece of DNA encoding a polypeptide comprising the Ste20 signature sequence G(T/N)P(Y/C/R)(W/R)MAPE(V/K) (SEQ ID NO: 6) and an N-terminal Ser/Thr kinase domain as defined above. A portion may be prepared, for example, by making one or more deletions to a Ste20-like nucleic acid. The portions may be used in isolated form or they may be fused to other coding (or non-coding) sequences in order to, for example, produce a protein that combines several activities. When fused to other coding sequences, the resulting polypeptide produced upon translation may be bigger than that predicted for the Ste20-like fragment. The portion is typically at least 300, 400, 500, 600 or 700 nucleotides in length, preferably at least 750, 800, 850, 900 or 950 nucleotides in length, more preferably at least 1000, 1100, 1200 or 1300 nucleotides in length and most preferably at least 1350, 1400 or 1450 nucleotides in length. Preferably, the portion is a portion of a nucleic acid as represented by any one of SEQ ID NO: 1, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31 or SEQ ID NO: 33. Most preferably the portion of a nucleic acid is as represented by SEQ ID NO: 1.

The terms “fragment”, “fragment of a sequence”, “part of a sequence”, “portion” or “portion thereof” mean a truncated sequence of the original sequence referred to. The truncated sequence (nucleic acid or protein sequence) can vary widely in length; the minimum size being a sequence of sufficient size to provide a sequence with at least a comparable function and/or activity of the original sequence referred to or hybridizing with the nucleic acid molecule of the invention or used in the process of the invention under stringent conditions, while the maximum size is not critical. In some applications, the maximum size usually is not substantially greater than that required to provide the desired activity and/or function(s) of the original sequence. A comparable function means at least 40%, 45% or 50%, preferably at least 60%, 70%, 80% or 90% or more of the function and/or activity of the original sequence.

Another variant of a Ste20-like nucleic acid/gene is a nucleic acid capable of hybridising under reduced stringency conditions, preferably under stringent conditions, with a Ste20-like nucleic acid/gene as hereinbefore defined or with a portion as defined hereinabove.

Hybridising sequences useful in the methods of the present invention encode a polypeptide having a Ste20 signature sequence G(T/N)P(Y/C/R)(W/R)MAPE(V/K) (SEQ ID NO: 6) and an N-terminal Ser/Thr kinase domain as defined above and having substantially the same biological activity as the Ste20-like protein represented by SEQ ID NO: 2 or homologues thereof. The hybridizing sequence is typically at least 800 nucleotides in length, preferably at least 1000 nucleotides in length, more preferably at least 1200 nucleotides in length and most preferably at least 1400 nucleotides in length.

Preferably, the hybridising sequence is one that is capable of hybridising to a nucleic acid as represented by (or to probes derived from) SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31 or SEQ ID NO: 33, or to a portion of any of the aforementioned sequences, a portion being as defined above. Most preferably the hybridising sequence is capable of hybridising to SEQ ID NO: 1, or to portions (or probes) thereof. Methods for designing probes are well known in the art. Probes are generally less than 1000 bp, 900 bp, 800 bp, 700 bp, 600 bp in length, preferably less than 500 bp, 400 bp, 300 bp, 200 bp or 100 bp in length. Commonly, probe lengths for DNA-DNA hybridisations such as Southern blotting vary between 100 and 500 bp, whereas the hybridising region in probes for DNA-DNA hybridisations such as in PCR amplification generally are shorter than 50 but longer than 10 nucleotides, preferably they are 15, 20, 25, 30, 35, 40, 45 or 50 bp in length.

The term “hybridisation” as defined herein is a process wherein substantially homologous complementary nucleotide sequences anneal to each other. The hybridisation process can occur entirely in solution, i.e. both complementary nucleic acids are in solution. The hybridisation process can also occur with one of the complementary nucleic acids immobilised to a matrix such as magnetic beads, Sepharose beads or any other resin. The hybridisation process can furthermore occur with one of the complementary nucleic acids immobilised to a solid support such as a nitro-cellulose or nylon membrane or immobilised by e.g. photolithography to, for example, a siliceous glass support (the latter known as nucleic acid arrays or micro-arrays or as nucleic acid chips). In order to allow hybridisation to occur, the nucleic acid molecules are generally thermally or chemically denatured to melt a double strand into two single strands and/or to remove hairpins or other secondary structures from single stranded nucleic acids.

The term “stringency” refers to the conditions under which a hybridisation takes place. The stringency of hybridisation is influenced by conditions such as temperature, salt concentration, ionic strength and hybridisation buffer composition. Generally, low stringency conditions are selected to be about 30° C. lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. Medium stringency conditions are when the temperature is 20° C. below T_m, and high stringency conditions are when the temperature is 10° C. below T_m. High stringency hybridisation conditions are typically used for isolating hybridising sequences that have high sequence similarity to the target nucleic acid sequence. However, nucleic acids may deviate in sequence and still encode a substantially identical polypeptide, due to the degeneracy of the genetic code. Therefore medium stringency hybridisation conditions may sometimes be needed to identify such nucleic acid molecules.
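The relationship between the melting temperature and the three stringency levels defined above amounts to a fixed temperature offset. A minimal sketch, in which the function name and the example melting temperature are assumptions made for illustration:

```python
# Offsets below the melting temperature, in degrees Celsius, as defined above.
OFFSETS_BELOW_TM = {"low": 30.0, "medium": 20.0, "high": 10.0}


def hybridisation_temperature(tm_celsius, stringency):
    """Hybridisation temperature for a given melting temperature and stringency."""
    return tm_celsius - OFFSETS_BELOW_TM[stringency]


# Example with an assumed melting temperature of 75 degrees Celsius.
for level in ("low", "medium", "high"):
    print(level, hybridisation_temperature(75.0, level))
```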

The T_m is the temperature under defined ionic strength and pH, at which 50% of the target sequence hybridises to a perfectly matched probe. The T_m is dependent upon the solution conditions and the base composition and length of the probe. For example, longer sequences hybridise specifically at higher temperatures. The maximum rate of hybridisation is obtained from about 16° C. up to 32° C. below T_m. The presence of monovalent cations in the hybridisation solution reduces the electrostatic repulsion between the two nucleic acid strands, thereby promoting hybrid formation; this effect is visible for sodium concentrations of up to 0.4 M (for higher concentrations, this effect may be ignored). Formamide reduces the melting temperature of DNA-DNA and DNA-RNA duplexes by 0.6 to 0.7° C. for each percent formamide, and addition of 50% formamide allows hybridisation to be performed at 30 to 45° C., though the rate of hybridisation will be lowered. Base pair mismatches reduce the hybridisation rate and the thermal stability of the duplexes. On average and for large probes, the T_m decreases about 1° C. per % base mismatch. The T_m may be calculated using the following equations, depending on the types of hybrids:

1) DNA-DNA hybrids (Meinkoth and Wahl, Anal. Biochem., 138: 267-284, 1984):

$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6 \times \log_{10}[\mathrm{Na}^{+}]^{a} + 0.41 \times \%[\mathrm{G/C}]^{b} - 500 \times [L^{c}]^{-1} - 0.61 \times \%\,\mathrm{formamide}$

2) DNA-RNA or RNA-RNA hybrids:

$T_m = 79.8 + 18.5\,(\log_{10}[\mathrm{Na}^{+}]^{a}) + 0.58\,(\%\,\mathrm{G/C}^{b}) + 11.8\,(\%\,\mathrm{G/C}^{b})^{2} - 820/L^{c}$

3) oligo-DNA or oligo-RNA^(d) hybrids:

For <20 nucleotides: $T_m = 2\,(l_n)$

For 20-35 nucleotides: $T_m = 22 + 1.46\,(l_n)$

^(a) or for other monovalent cation, but only accurate in the 0.01-0.4 M range.
^(b) only accurate for % GC in the 30% to 75% range.
^(c) L = length of duplex in base pairs.
^(d) Oligo, oligonucleotide; l_n, effective length of primer = 2×(no. of G/C) + (no. of A/T).
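The DNA-DNA and oligonucleotide equations above may be evaluated as in the following sketch. The function names are hypothetical. The effective length counts each G or C twice and each A or T once; counting U together with A and T for RNA oligonucleotides is an assumption. The DNA-RNA equation is deliberately left out of the sketch.

```python
import math


def tm_dna_dna(na_molar, gc_percent, length_bp, formamide_percent=0.0):
    """Melting temperature (deg C) of a DNA-DNA hybrid, equation 1) above.

    na_molar: monovalent cation concentration in mol/l (accurate 0.01-0.4 M)
    gc_percent: G/C content in percent (accurate 30-75%)
    length_bp: duplex length in base pairs
    """
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 500.0 / length_bp
        - 0.61 * formamide_percent
    )


def effective_length(oligo):
    """Effective length: 2 x (no. of G/C) + (no. of A/T); U counted as A/T."""
    seq = oligo.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    return 2 * gc + at


def tm_oligo(oligo):
    """Melting temperature (deg C) of an oligonucleotide hybrid, equations 3)."""
    n = len(oligo)
    l_n = effective_length(oligo)
    if n < 20:
        return 2.0 * l_n
    if n <= 35:
        return 22.0 + 1.46 * l_n
    raise ValueError("the oligonucleotide equations cover up to 35 nucleotides")


# Hypothetical inputs, for illustration only.
print(tm_dna_dna(na_molar=0.195, gc_percent=45.0, length_bp=500, formamide_percent=50.0))
print(tm_oligo("ATGCATGCAT"))
```

The value 0.195 M used in the example corresponds to the sodium contributed by 1×SSC if the citrate is taken to be the trisodium salt, which is an assumption.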

Non-specific binding may be controlled using any one of a number of known techniques such as, for example, blocking the membrane with protein containing solutions, additions of heterologous RNA, DNA, and SDS to the hybridisation buffer, and treatment with RNase. For non-homologous probes, a series of hybridizations may be performed by varying one of (i) progressively lowering the annealing temperature (for example from 68° C. to 42° C.) or (ii) progressively lowering the formamide concentration (for example from 50% to 0%). The skilled artisan is aware of various parameters which may be altered during hybridisation and which will either maintain or change the stringency conditions.

Besides the hybridisation conditions, specificity of hybridisation typically also depends on the function of post-hybridisation washes. To remove background resulting from non-specific hybridisation, samples are washed with dilute salt solutions. Critical factors of such washes include the ionic strength and temperature of the final wash solution: the lower the salt concentration and the higher the wash temperature, the higher the stringency of the wash. Wash conditions are typically performed at or below hybridisation stringency. A positive hybridisation gives a signal that is at least twice that of the background. Generally, suitable stringent conditions for nucleic acid hybridisation assays or gene amplification detection procedures are as set forth above. More or less stringent conditions may also be selected. The skilled artisan is aware of various parameters which may be altered during washing and which will either maintain or change the stringency conditions.

For example, typical high stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 65° C. in 1×SSC or at 42° C. in 1×SSC and 50% formamide, followed by washing at 65° C. in 0.3×SSC. Examples of medium stringency hybridisation conditions for DNA hybrids longer than 50 nucleotides encompass hybridisation at 50° C. in 4×SSC or at 40° C. in 6×SSC and 50% formamide, followed by washing at 50° C. in 2×SSC. The length of the hybrid is the anticipated length for the hybridising nucleic acid. When nucleic acids of known sequence are hybridised, the hybrid length may be determined by aligning the sequences and identifying the conserved regions described herein. 1×SSC is 0.15 M NaCl and 15 mM sodium citrate; the hybridisations and washes may additionally include 5×Denhardt's reagent, 0.5-1.0% SDS, 100 μg/ml denatured, fragmented salmon sperm DNA, 0.5% sodium pyrophosphate.

For the purposes of defining the level of stringency, reference can conveniently be made to Sambrook et al. (2001) Molecular Cloning: a laboratory manual, 3rd Edition, Cold Spring Harbor Laboratory Press, CSH, New York, or to Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989 and yearly updates).

Also useful in the methods of the invention are nucleic acids encoding homologues of the amino acid sequence represented by SEQ ID NO: 2.

A homologue may be in the form of a “substitutional variant” of a protein, i.e. where at least one residue in an amino acid sequence has been removed and a different residue inserted in its place. Amino acid substitutions are typically of single residues, but may be clustered depending upon functional constraints placed upon the polypeptide; insertions will usually be of the order of about 1 to 10 amino acid residues. Preferably, amino acid substitutions comprise conservative amino acid substitutions. To produce such homologues, amino acids of the protein may be replaced by other amino acids having similar properties (such as similar hydrophobicity, hydrophilicity, antigenicity, propensity to form or break α-helical structures or β-sheet structures). Conservative substitution tables are well known in the art (see for example Creighton (1984) Proteins. W.H. Freeman and Company and Table 1 below).

TABLE 1
Examples of conserved amino acid substitutions

Residue    Conservative Substitutions
Ala        Ser
Arg        Lys
Asn        Gln; His
Asp        Glu
Gln        Asn
Cys        Ser
Glu        Asp
Gly        Pro
His        Asn; Gln
Ile        Leu; Val
Leu        Ile; Val
Lys        Arg; Gln
Met        Leu; Ile
Phe        Met; Leu; Tyr
Ser        Thr; Gly
Thr        Ser; Val
Trp        Tyr
Tyr        Trp; Phe
Val        Ile; Leu
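Table 1 may be represented as a simple lookup in order to check whether a given substitution is listed as conservative. The sketch below is illustrative; the function name is hypothetical, and the lookup is directional, exactly as the table is written (for example, Ala to Ser is listed, whereas Ser to Ala is not).

```python
# Table 1 as a lookup: residue -> set of listed conservative substitutions.
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Gln": {"Asn"},
    "Cys": {"Ser"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """True if Table 1 lists 'replacement' as conservative for 'original'."""
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Ile", "Val"))
print(is_conservative("Trp", "Gly"))
```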

A homologue may also be in the form of an “insertional variant” of a protein, i.e. where one or more amino acid residues are introduced into a predetermined site in a protein. Insertions may comprise N-terminal and/or C-terminal fusions as well as intra-sequence insertions of single or multiple amino acids. Generally, insertions within the amino acid sequence will be smaller than N- or C-terminal fusions, of the order of about 1 to 10 residues. Examples of N- or C-terminal fusion proteins or peptides include the binding domain or activation domain of a transcriptional activator as used in the yeast two-hybrid system, phage coat proteins, (histidine)-6-tag, glutathione S-transferase-tag, protein A, maltose-binding protein, dihydrofolate reductase, Tag·100 epitope, c-myc epitope, FLAG®-epitope, lacZ, CMP (calmodulin-binding peptide), HA epitope, protein C epitope and VSV epitope.

Homologues in the form of “deletion variants” of a protein are characterised by the removal of one or more amino acids from a protein.

Amino acid variants of a protein (substitution-, deletion- and/or insertion-variants) may readily be made using peptide synthetic techniques well known in the art, such as solid phase peptide synthesis and the like, or by recombinant DNA manipulations. Methods for the manipulation of DNA sequences to produce substitution, insertion or deletion variants of a protein are well known in the art. For example, techniques for making substitution mutations at predetermined sites in DNA are well known to those skilled in the art and include M13 mutagenesis, T7-Gen in vitro mutagenesis (USB, Cleveland, Ohio), QuickChange Site Directed mutagenesis (Stratagene, San Diego, Calif.), PCR-mediated site-directed mutagenesis or other site-directed mutagenesis protocols.

The Ste20-like polypeptide or homologue thereof may also be a derivative. “Derivatives” include peptides, oligopeptides, polypeptides, proteins and enzymes which may comprise substitutions, deletions or additions of naturally and non-naturally occurring amino acid residues compared to the amino acid sequence of a naturally-occurring form of the protein, for example, as presented in SEQ ID NO: 2. “Derivatives” include peptides, oligopeptides, polypeptides which may, compared to the amino acid sequence of the naturally-occurring form of the protein, such as the one presented in SEQ ID NO: 2, comprise substitutions of amino acids with non-naturally occurring amino acid residues, or additions of non-naturally occurring amino acid residues. “Derivatives” of a protein also encompass peptides, oligopeptides, polypeptides which may comprise naturally occurring altered (glycosylated, acylated, ubiquitinated, prenylated, phosphorylated, myristoylated, sulphated etc) or non-naturally altered amino acid residues compared to the amino acid sequence of a naturally-occurring form of the polypeptide. A derivative may also comprise one or more non-amino acid substituents or additions compared to the amino acid sequence from which it is derived, for example a reporter molecule or other ligand, covalently or non-covalently bound to the amino acid sequence, such as a reporter molecule which is bound to facilitate its detection, and non-naturally occurring amino acid residues relative to the amino acid sequence of a naturally-occurring protein. Derivatives of orthologues or paralogues of SEQ ID NO: 2 are further examples which may be suitable for use in the methods of the invention.

The Ste20-like polypeptide or homologue thereof may be encoded by a splice variant of a Ste20-like nucleic acid/gene. The term “splice variant” as used herein encompasses variants of a nucleic acid sequence in which selected introns and/or exons have been excised, replaced, displaced or added, or in which introns have been shortened or lengthened. Such variants will be ones in which the biological activity of the protein is substantially retained; this may be achieved by selectively retaining functional segments of the protein. Such splice variants may be found in nature or may be manmade. Methods for making such splice variants are known in the art. Preferred splice variants are splice variants of the nucleic acid encoding a polypeptide comprising the Ste20 signature sequence (SEQ ID NO: 6) and an N-terminal Ser/Thr kinase domain as defined above. Preferably, the Ste20-like polypeptide or a homologue thereof additionally comprises SEQ ID NO: 7, more preferably the Ste20-like polypeptide or a homologue thereof comprises one or more of the following: SEQ ID NO: 8, SEQ ID NO: 9 and SEQ ID NO: 10. Further preferred are splice variants of nucleic acids represented by SEQ ID NO: 1, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31 and SEQ ID NO: 33. Most preferred is a splice variant of the nucleic acid represented by SEQ ID NO: 1.

Another nucleic acid variant useful in the methods of the invention is an allelic variant of a nucleic acid encoding a Ste20-like polypeptide or a homologue thereof as defined above, preferably an allelic variant of a nucleic acid encoding a Ste20-like polypeptide comprising the Ste20 signature sequence (SEQ ID NO: 6) and an N-terminal Ser/Thr kinase domain. Preferably, the Ste20-like polypeptide or a homologue thereof additionally comprises SEQ ID NO: 7, more preferably the Ste20-like polypeptide or a homologue thereof comprises one or more of the following: SEQ ID NO: 8, SEQ ID NO: 9 and SEQ ID NO: 10. Further preferred are allelic variants of nucleic acids represented by SEQ ID NO: 1, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31 and SEQ ID NO: 33. Most preferred is an allelic variant of a nucleic acid as represented by SEQ ID NO: 1. Allelic variants exist in nature, and encompassed within the methods of the present invention is the use of these natural alleles. Allelic variants encompass Single Nucleotide Polymorphisms (SNPs), as well as Small Insertion/Deletion Polymorphisms (INDELs). The size of INDELs is usually less than 100 bp. SNPs and INDELs form the largest set of sequence variants in naturally occurring polymorphic strains of most organisms.

A further nucleic acid variant useful in the methods of the invention is a nucleic acid variant obtained by gene shuffling. Gene shuffling or directed evolution may also be used to generate variants of Ste20-like nucleic acids. This consists of iterations of DNA shuffling followed by appropriate screening and/or selection to generate variants of Ste20-like nucleic acids or portions thereof having a modified biological activity (Castle et al., (2004) Science 304(5674): 1151-4; U.S. Pat. Nos. 5,811,238 and 6,395,547).

Furthermore, site-directed mutagenesis may be used to generate variants of Ste20-like nucleic acids. Several methods are available to achieve site-directed mutagenesis, the most common being PCR based methods (Current Protocols in Molecular Biology. Wiley Eds.).

The Ste20-like nucleic acid or variant thereof may be derived from any natural or artificial source. The nucleic acid/gene or variant thereof may be isolated from a microbial source, such as yeast or fungi, or from a plant, algae or animal (including human) source. This nucleic acid may be modified from its native form in composition and/or genomic environment through deliberate human manipulation. The nucleic acid is preferably of plant origin, whether from the same plant species (for example the one into which it is to be introduced) or from a different plant species. The nucleic acid may be isolated from a dicotyledonous species, preferably from the family Brassicaceae, further preferably from Arabidopsis thaliana. More preferably, the Ste20-like nucleic acid is isolated from Arabidopsis thaliana and is represented by SEQ ID NO: 1, and the Ste20-like amino acid sequence is as represented by SEQ ID NO: 2.

According to a preferred aspect of the present invention, modulated, preferably increased expression of the Ste20-like nucleic acid or variant thereof is envisaged. Methods for increasing expression of genes or gene products are well documented in the art and include, for example, overexpression driven by appropriate promoters, the use of transcription enhancers or translation enhancers. Isolated nucleic acids which serve as promoter or enhancer elements may be introduced in an appropriate position (typically upstream) of a non-heterologous form of a polynucleotide so as to upregulate expression of a Ste20-like nucleic acid or variant thereof. For example, endogenous promoters may be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., PCT/US93/03868), or isolated promoters may be introduced into a plant cell in the proper orientation and distance from a gene of the present invention so as to control the expression of the gene. Methods for reducing the expression of genes or gene products are well documented in the art.

The expression of a nucleic acid encoding a Ste20-like polypeptide or a homologue thereof may be modulated by introducing a genetic modification (preferably in the locus of a Ste20-like gene). The locus of a gene as defined herein is taken to mean a genomic region, which includes the gene of interest and 10 kb upstream or downstream of the coding region.
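
The locus definition above can be illustrated with a minimal sketch. It assumes 1-based inclusive coordinates on a single chromosome; the names `Interval`, `locus_of` and `in_locus` are hypothetical and are introduced here for illustration only.

```python
from dataclasses import dataclass

FLANK_BP = 10_000  # 10 kb on either side of the coding region, per the definition above


@dataclass(frozen=True)
class Interval:
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def locus_of(coding: Interval, chromosome_length: int, flank: int = FLANK_BP) -> Interval:
    """Return the coding region extended by `flank` bp on each side, clipped to the chromosome."""
    return Interval(max(1, coding.start - flank), min(chromosome_length, coding.end + flank))


def in_locus(position: int, locus: Interval) -> bool:
    """True if a modification at `position` falls inside the locus."""
    return locus.start <= position <= locus.end
```

Under these assumptions, a coding region spanning positions 25,000 to 27,000 would have a locus spanning positions 15,000 to 37,000, and a modification at any position in that range would count as being in the locus of the gene.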

The genetic modification may be introduced by, for example, T-DNA activation, TILLING, or homologous recombination. Following introduction of the genetic modification, there follows a step of selecting for modified expression of a nucleic acid encoding a Ste20-like polypeptide or a homologue thereof, which modification in expression gives plants having increased yield.

T-DNA activation tagging (Hayashi et al. Science (1992) 1350-1353) involves insertion of T-DNA, usually containing a promoter (may also be a translation enhancer or an intron), in the genomic region of the gene of interest or 10 kb upstream or downstream of the coding region of a gene in a configuration such that the promoter directs expression of the targeted gene. Typically, regulation of expression of the targeted gene by its natural promoter is disrupted and the gene falls under the control of the newly introduced promoter. The promoter is typically embedded in a T-DNA. This T-DNA is randomly inserted into the plant genome, for example, through Agrobacterium infection and leads to overexpression of genes near the inserted T-DNA. The resulting transgenic plants show dominant phenotypes due to overexpression of genes close to the introduced promoter. The promoter to be introduced may be any promoter capable of directing expression of a gene in the desired organism, in this case a plant. For example, constitutive, tissue-preferred, cell type-preferred and inducible promoters are all suitable for use in T-DNA activation.

A genetic modification may also be introduced in the locus of a Ste20-like gene using the technique of TILLING (Targeted Induced Local Lesions In Genomes). This is a mutagenesis technology useful to generate and/or identify (and to eventually isolate) mutagenised variants of a Ste20-like nucleic acid with modulated expression and/or activity. TILLING also allows selection of plants carrying such mutant variants. These mutant variants may exhibit modified expression, either in strength or in location or in timing (if the mutations affect the promoter for example). These mutant variants may even exhibit higher Ste20-like activity than that exhibited by the gene in its natural form. TILLING combines high-density mutagenesis with high-throughput screening methods. The steps typically followed in TILLING are: (a) EMS mutagenesis (Redei GP and Koncz C (1992) In Methods in Arabidopsis Research, Koncz C, Chua N H, Schell J, eds. Singapore, World Scientific Publishing Co, pp. 16-82; Feldmann et al., (1994) In Meyerowitz E M, Somerville C R, eds, Arabidopsis. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp 137-172; Lightner J and Caspar T (1998) In J Martinez-Zapater, J Salinas, eds, Methods on Molecular Biology, Vol. 82. Humana Press, Totowa, N.J., pp 91-104); (b) DNA preparation and pooling of individuals; (c) PCR amplification of a region of interest; (d) denaturation and annealing to allow formation of heteroduplexes; (e) DHPLC, where the presence of a heteroduplex in a pool is detected as an extra peak in the chromatogram; (f) identification of the mutant individual; and (g) sequencing of the mutant PCR product. Methods for TILLING are well known in the art (McCallum et al., (2000) Nat Biotechnol 18: 455-457; reviewed by Stemple (2004) Nat Rev Genet 5(2): 145-50).
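
By way of illustration only, the final steps (f) and (g) can be sketched as a comparison of a sequenced PCR product against the reference amplicon. The sketch assumes equal-length, already aligned sequences and relies on the general observation that EMS predominantly induces G/C to A/T transitions; the function names are hypothetical and do not form part of any TILLING protocol cited above.

```python
# Base changes typical of EMS mutagenesis on either strand (assumption: G/C -> A/T transitions).
EMS_TYPICAL = {("G", "A"), ("C", "T")}


def point_mutations(reference: str, sample: str) -> list:
    """List (position, reference base, sample base) for every mismatch; positions are 1-based."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()))
        if r != s
    ]


def is_ems_typical(ref_base: str, alt_base: str) -> bool:
    """True if the change is a G>A or C>T transition."""
    return (ref_base.upper(), alt_base.upper()) in EMS_TYPICAL
```

A mismatch that is not a G>A or C>T change is not excluded by this check; it would merely be flagged as atypical for EMS and might merit re-sequencing.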

T-DNA activation and TILLING are examples of technologies that enable the generation of novel alleles and Ste20-like variants.

The effects of the invention may also be reproduced using homologous recombination, which allows introduction in a genome of a selected nucleic acid at a defined selected position. Homologous recombination is a standard technology used routinely in biological sciences for lower organisms such as yeast or the moss Physcomitrella. Methods for performing homologous recombination in plants have been described not only for model plants (Offringa et al. (1990) EMBO J 9(10): 3077-84) but also for crop plants, for example rice (Terada et al. (2002) Nat Biotech 20(10): 1030-4; Iida and Terada (2004) Curr Opin Biotech 15(2):132-8). The nucleic acid to be targeted (which may be a Ste20-like nucleic acid or variant thereof as hereinbefore defined) need not be targeted to the locus of a Ste20-like gene, but may be introduced in, for example, regions of high expression. The nucleic acid to be targeted may be an improved allele used to replace the endogenous gene or may be introduced in addition to the endogenous gene.

A preferred method for introducing a genetic modification (which in this case need not be in the locus of a Ste20-like gene) is to introduce and express in a plant a nucleic acid encoding a Ste20-like polypeptide or a homologue thereof, as defined above. The nucleic acid to be introduced into a plant may be a full-length nucleic acid or may be a portion or a hybridising sequence as hereinbefore defined.

The invention also provides genetic constructs and vectors to facilitate introduction and/or expression of the nucleotide sequences useful in the methods according to the invention.

Therefore, there is provided a gene construct comprising:

-   (i) a Ste20-like nucleic acid or variant thereof, as defined hereinabove;
-   (ii) one or more control sequences capable of driving expression of the nucleic acid sequence of (i); and optionally
-   (iii) a transcription termination sequence.
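
The composition of such a gene construct can be pictured with a minimal sketch, in which elements (i) and (ii) are mandatory and element (iii) is optional. The class `GeneConstruct`, its field names and the string labels used for the elements are hypothetical and serve only to illustrate the structure; they are not sequences or constructs of the invention.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GeneConstruct:
    nucleic_acid: str                   # (i) the Ste20-like nucleic acid or variant
    control_sequences: Tuple[str, ...]  # (ii) one or more control sequences
    terminator: Optional[str] = None    # (iii) optional transcription termination sequence

    def __post_init__(self):
        if not self.nucleic_acid:
            raise ValueError("element (i) is required")
        if len(self.control_sequences) < 1:
            raise ValueError("element (ii) requires at least one control sequence")

    def layout(self) -> str:
        """Describe the elements in 5' to 3' order: control sequence(s), nucleic acid, terminator."""
        parts = list(self.control_sequences) + [self.nucleic_acid]
        if self.terminator is not None:
            parts.append(self.terminator)
        return " :: ".join(parts)
```

For instance, `GeneConstruct("Ste20-like", ("promoter",), "terminator").layout()` would describe a construct with all three elements, whereas omitting the third argument describes a construct without a termination sequence.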

Constructs useful in the methods according to the present invention may be constructed using recombinant DNA technology well known to persons skilled in the art. The gene constructs may be inserted into vectors, which may be commercially available, suitable for transforming into plants and suitable for expression of the gene of interest in the transformed cells.

Plants are transformed with a vector comprising the sequence of interest (i.e., a nucleic acid encoding a Ste20-like polypeptide or homologue thereof). The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells containing the sequence of interest. The sequence of interest is operably linked to one or more control sequences (at least to a promoter). The terms “regulatory element”, “control sequence” and “promoter” are all used interchangeably herein and are to be taken in a broad context to refer to regulatory nucleic acid sequences capable of effecting expression of the sequences to which they are ligated. The term “promoter” typically refers to a nucleic acid control sequence located upstream from the transcriptional start of a gene and which is involved in recognising and binding of RNA polymerase and other proteins, thereby directing transcription of an operably linked nucleic acid. Encompassed by the aforementioned terms are transcriptional regulatory sequences derived from a classical eukaryotic genomic gene (including the TATA box which is required for accurate transcription initiation, with or without a CCAAT box sequence) and additional regulatory elements (i.e. upstream activating sequences, enhancers and silencers) which alter gene expression in response to developmental and/or external stimuli, or in a tissue-specific manner. Also included within the term is a transcriptional regulatory sequence of a classical prokaryotic gene, in which case it may include a −35 box sequence and/or −10 box transcriptional regulatory sequences. The term “regulatory element” also encompasses a synthetic fusion molecule or derivative that confers, activates or enhances expression of a nucleic acid molecule in a cell, tissue or organ.
The term “operably linked” as used herein refers to a functional linkage between the promoter sequence and the gene of interest, such that the promoter sequence is able to initiate transcription of the gene of interest.

Suitable promoters, which are functional in plants, are generally known. They may take the form of constitutive or inducible promoters. Suitable promoters can enable the development- and/or tissue-specific expression in multi-celled eukaryotes; thus, leaf-, root-, flower-, seed-, stomata-, tuber- or fruit-specific promoters may advantageously be used in plants.

Different promoters usable in plants are, for example, the USP, the LegB4 and the DC3 promoters or the ubiquitin promoter from parsley.

A “plant” promoter comprises regulatory elements, which mediate the expression of a coding sequence segment in plant cells. Accordingly, a plant promoter need not be of plant origin, but may originate from viruses or microorganisms, in particular for example from viruses which attack plant cells.

The “plant” promoter can also originate from a plant cell, e.g. from the plant which is transformed with the nucleic acid sequence to be expressed in the inventive process and described herein. This also applies to other “plant” regulatory signals, for example “plant” terminators.

For expression in plants, the nucleic acid molecule must, as described above, be linked operably to or comprise a suitable promoter which expresses the gene at the right point in time and in a cell- or tissue-specific manner. Usable promoters are constitutive promoters (Benfey et al., EMBO J. 8 (1989) 2195-2202), such as those which originate from plant viruses, such as 35S CaMV (Franck et al., Cell 21 (1980) 285-294), 19S CaMV (see also U.S. Pat. No. 5,352,605 and WO 84/02913), 34S FMV (Sanger et al., Plant. Mol. Biol., 14, 1990: 433-443), the parsley ubiquitin promoter, or plant promoters such as the Rubisco small subunit promoter described in U.S. Pat. No. 4,962,028 or the plant promoters PRP1 [Ward et al., Plant. Mol. Biol. 22 (1993)], SSU, PGEL1, OCS [Leisner (1988) Proc Natl Acad Sci USA 85(5):2553-2557], lib4, usp, mas [Comai (1990) Plant Mol Biol 15 (3):373-381], STLS1, ScBV (Schenk (1999) Plant Mol Biol 39(6):1221-1230), B33, SAD1 or SAD2 (flax promoters, Jain et al., Crop Science, 39 (6), 1999: 1696-1701) or nos [Shaw et al. (1984) Nucleic Acids Res. 12(20):7831-7846]. Further examples of constitutive plant promoters are the sugarbeet V-ATPase promoters (WO 01/14572). Examples of synthetic constitutive promoters are the Super promoter (WO 95/14098) and promoters derived from G-boxes (WO 94/12015). If appropriate, chemical inducible promoters may furthermore also be used, compare EP-A 388186, EP-A 335528, WO 97/06268. Stable, constitutive expression of the proteins according to the invention in a plant can be advantageous. However, inducible expression of the polypeptide of the invention is advantageous if a late expression before the harvest is of advantage, as metabolic manipulation may lead to plant growth retardation.

The expression of plant genes can also be facilitated via a chemical inducible promoter (for a review, see Gatz 1997, Annu. Rev. Plant Physiol. Plant Mol. Biol., 48:89-108). Chemically inducible promoters are particularly suitable when it is desired to express the gene in a time-specific manner. Examples of such promoters are a salicylic acid inducible promoter (WO 95/19443), an abscisic acid-inducible promoter (EP 335 528), a tetracyclin-inducible promoter (Gatz et al. (1992) Plant J. 2, 397-404), a cyclohexanol- or ethanol-inducible promoter (WO 93/21334) or others as described herein.

Other suitable promoters are those which react to biotic or abiotic stress conditions, for example the pathogen-induced PRP1 gene promoter (Ward et al., Plant. Mol. Biol. 22 (1993) 361-366), the tomato heat-inducible hsp80 promoter (U.S. Pat. No. 5,187,267), the potato chill-inducible alpha-amylase promoter (WO 96/12814) or the wound-inducible pinII promoter (EP-A-0 375 091) or others as described herein.

Preferred promoters are in particular those which bring about gene expression in tissues and organs, in seed cells, such as endosperm cells and cells of the developing embryo. Suitable promoters are the oilseed rape napin gene promoter (U.S. Pat. No. 5,608,152), the Vicia faba USP promoter (Baeumlein et al., Mol Gen Genet, 1991, 225 (3): 459-67), the Arabidopsis oleosin promoter (WO 98/45461), the Phaseolus vulgaris phaseolin promoter (U.S. Pat. No. 5,504,200), the Brassica Bce4 promoter (WO 91/13980), the bean arc5 promoter, the carrot DcG3 promoter, or the Legumin B4 promoter (LeB4; Baeumlein et al., 1992, Plant Journal, 2 (2): 233-9), and promoters which bring about the seed-specific expression in monocotyledonous plants such as maize, barley, wheat, rye, rice and the like. Advantageous seed-specific promoters are the sucrose binding protein promoter (WO 00/26388), the phaseolin promoter and the napin promoter. Suitable promoters which should be considered are the barley lpt2 or lpt1 gene promoter (WO 95/15389 and WO 95/23230), and the promoters described in WO 99/16890 (promoters from the barley hordein gene, the rice glutelin gene, the rice oryzin gene, the rice prolamin gene, the wheat gliadin gene, the wheat glutelin gene, the maize zein gene, the oat glutelin gene, the sorghum kafirin gene and the rye secalin gene). Further suitable promoters are Amy32b, Amy 6-6 and Aleurain [U.S. Pat. No. 5,677,474], Bce4 (oilseed rape) [U.S. Pat. No. 5,530,149], glycinin (soya) [EP 571 741], phosphoenolpyruvate carboxylase (soya) [JP 06/62870], ADR12-2 (soya) [WO 98/08962], isocitrate lyase (oilseed rape) [U.S. Pat. No. 5,689,040] or α-amylase (barley) [EP 781 849]. Other promoters which are available for the expression of genes in plants are leaf-specific promoters such as those described in DE-A 19644478 or light-regulated promoters such as, for example, the pea petE promoter.

Further suitable plant promoters are the cytosolic FBPase promoter or the potato ST-LS1 promoter (Stockhaus et al., EMBO J. 8, 1989, 2445), the Glycine max phosphoribosylpyrophosphate amidotransferase promoter (GenBank Accession No. U87999) or the node-specific promoter described in EP-A-0 249 676.

Advantageously, any type of promoter may be used to drive expression of the nucleic acid sequence. The promoter may be an inducible promoter, i.e. having induced or increased transcription initiation in response to a chemical, environmental or physical stimulus. An example of an inducible promoter is a stress-inducible promoter, i.e. a promoter activated when a plant is exposed to various stress conditions, or a pathogen-induced promoter. Additionally or alternatively, the promoter may be a tissue-preferred promoter, i.e. one that is capable of preferentially initiating transcription in certain tissues, such as the leaves, roots, seed tissue etc; or may be a ubiquitous promoter, which is active in substantially all tissues or cells of an organism, or the promoter may be developmentally regulated, thereby being active during certain developmental stages or in parts of the plant that undergo developmental changes. Promoters able to initiate transcription in certain tissues only are referred to herein as “tissue-specific”; similarly, promoters able to initiate transcription in certain cells only are referred to herein as “cell-specific”.

Preferably, the Ste20-like nucleic acid or variant thereof is operably linked to a constitutive promoter. A constitutive promoter is transcriptionally active during most, but not necessarily all, phases of its growth and development and under most environmental conditions in at least one cell, tissue or organ. A preferred constitutive promoter is a constitutive promoter that is also substantially ubiquitously expressed. Further preferably the promoter is derived from a plant, more preferably a monocotyledonous plant. Most preferred is use of a GOS2 promoter (from rice) (as used in the expression cassette of SEQ ID NO: 5). It should be clear that the applicability of the present invention is not restricted to the Ste20-like nucleic acid represented by SEQ ID NO: 1, nor is the applicability of the invention restricted to expression of a nucleic acid encoding a Ste20-like protein when driven by a GOS2 promoter. Examples of other constitutive promoters which may also be used to drive expression of a nucleic acid encoding a Ste20-like protein are shown in Table 2 below.

TABLE 2 Examples of constitutive promoters

Gene Source        Expression Pattern   Reference
Actin              Constitutive         McElroy et al, Plant Cell, 2: 163-171, 1990
CAMV 35S           Constitutive         Odell et al, Nature, 313: 810-812, 1985
CaMV 19S           Constitutive         Nilsson et al., Physiol. Plant. 100: 456-462, 1997
GOS2               Constitutive         de Pater et al, Plant J Nov; 2(6): 837-44, 1992
Ubiquitin          Constitutive         Christensen et al, Plant Mol. Biol. 18: 675-689, 1992
Rice cyclophilin   Constitutive         Buchholz et al, Plant Mol Biol. 25(5): 837-43, 1994
Maize H3 histone   Constitutive         Lepetit et al, Mol. Gen. Genet. 231: 276-285, 1992
Actin 2            Constitutive         An et al, Plant J. 10(1); 107-121, 1996

Optionally, one or more terminator sequences (also a control sequence) may be used in the construct introduced into a plant. The term “terminator” encompasses a control sequence which is a DNA sequence at the end of a transcriptional unit which signals 3′ processing and polyadenylation of a primary transcript and termination of transcription. The terminator can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The terminator to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, less preferably from any other eukaryotic gene. Additional regulatory elements may include transcriptional as well as translational enhancers. Those skilled in the art will be aware of terminator and enhancer sequences that may be suitable for use in performing the invention. Such sequences would be known or may readily be obtained by a person skilled in the art.

An intron sequence may also be added to the 5′ untranslated region or the coding sequence of the partial coding sequence to increase the amount of the mature message that accumulates in the cytosol. Inclusion of a spliceable intron in the transcription unit in both plant and animal expression constructs has been shown to increase gene expression at both the mRNA and protein levels up to 1000-fold (Buchman and Berg, Mol. Cell Biol. 8:4395-4405 (1988); Callis et al., Genes Dev. 1:1183-1200 (1987)). Such intron enhancement of gene expression is typically greatest when the intron is placed near the 5′ end of the transcription unit. Use of the maize introns Adh1-S intron 1, 2, and 6, and of the Bronze-1 intron, is known in the art. See generally, The Maize Handbook, Chapter 116, Freeling and Walbot, Eds., Springer, N.Y. (1994).

Other control sequences (besides promoter, enhancer, silencer, intron sequences, 3′UTR and/or 5′UTR regions) may be protein and/or RNA stabilizing elements. Such sequences would be known or may readily be obtained by a person skilled in the art.

The genetic constructs of the invention may further include an origin of replication sequence that is required for maintenance and/or replication in a specific cell type. One example is when a genetic construct is required to be maintained in a bacterial cell as an episomal genetic element (e.g. plasmid or cosmid molecule). Preferred origins of replication include, but are not limited to, the f1-ori and colE1.

For the detection and/or selection of the successful transfer of the nucleic acid sequences as depicted in the sequence protocol and used in the process of the invention, it is advantageous to use marker genes (=reporter genes). These marker genes enable the identification of a successful transfer of the nucleic acid molecules via a series of different principles, for example via visual identification with the aid of fluorescence, luminescence or in the wavelength range of light which is discernible for the human eye, by a resistance to herbicides or antibiotics, via what are known as nutritive markers (auxotrophism markers) or antinutritive markers, via enzyme assays or via phytohormones. Examples of such markers which may be mentioned are GFP (=green fluorescent protein); the luciferin/luciferase system; the β-galactosidase with its colored substrates, for example X-Gal; the herbicide resistances to, for example, imidazolinone, glyphosate, phosphinothricin or sulfonylurea; the antibiotic resistances to, for example, bleomycin, hygromycin, streptomycin, kanamycin, tetracyclin, chloramphenicol, ampicillin, gentamycin, geneticin (G418), spectinomycin or blasticidin, to mention only a few; nutritive markers such as the utilization of mannose or xylose; or antinutritive markers such as the resistance to 2-deoxyglucose. This list is a small number of possible markers. The skilled worker is very familiar with such markers. Different markers are preferred, depending on the organism and the selection method.

Therefore the genetic construct may optionally comprise a selectable marker gene. As used herein, the term “selectable marker” or “selectable marker gene” includes any gene that confers a phenotype on a cell in which it is expressed to facilitate the identification and/or selection of cells that are transfected or transformed with a nucleic acid construct of the invention. Suitable markers may be selected from markers that confer antibiotic or herbicide resistance, that introduce a new metabolic trait or that allow visual selection. Examples of selectable marker genes include genes conferring resistance to antibiotics (such as nptII that phosphorylates neomycin and kanamycin, or hpt, phosphorylating hygromycin), to herbicides (for example bar which provides resistance to Basta; aroA or gox providing resistance against glyphosate), or genes that provide a metabolic trait (such as manA that allows plants to use mannose as sole carbon source). Visual marker genes result in the formation of colour (for example β-glucuronidase, GUS), luminescence (such as luciferase) or fluorescence (Green Fluorescent Protein, GFP, and derivatives thereof).

It is known that, upon stable or transient integration of nucleic acids into plant cells, only a minority of the cells takes up the foreign DNA and, if desired, integrates it into its genome, depending on the expression vector used and the transfection technique used. To identify and select these integrants, a gene encoding a selectable marker (as described above, for example resistance to antibiotics) is usually introduced into the host cells together with the gene of interest. Preferred selectable markers in plants comprise those which confer resistance to a herbicide such as glyphosate or glufosinate. Other suitable markers are, for example, markers which encode genes involved in biosynthetic pathways of, for example, sugars or amino acids, such as β-galactosidase, ura3 or ilv2. Markers which encode genes such as luciferase, gfp or other fluorescence genes are likewise suitable. These markers and the aforementioned markers can be used in mutants in which these genes are not functional since, for example, they have been deleted by conventional methods. Furthermore, nucleic acid molecules which encode a selectable marker can be introduced into a host cell on the same vector as those which encode the polypeptides of the invention or used in the process, or else in a separate vector. Cells which have been transfected stably with the nucleic acid introduced can be identified for example by selection (for example, cells which have integrated the selectable marker survive whereas the other cells die).

Since the marker genes, as a rule specifically the gene for resistance to antibiotics and herbicides, are no longer required or are undesired in the transgenic host cell once the nucleic acids have been introduced successfully, the process according to the invention for introducing the nucleic acids advantageously employs techniques which enable the removal, or excision, of these marker genes. One such method is what is known as cotransformation. The cotransformation method employs two vectors simultaneously for the transformation, one vector bearing the nucleic acid according to the invention and a second bearing the marker gene(s). A large proportion of transformants receives or, in the case of plants, comprises (up to 40% of the transformants and above), both vectors. In case of transformation with Agrobacteria, the transformants usually receive only a part of the vector, the sequence flanked by the T-DNA, which usually represents the expression cassette. The marker genes can subsequently be removed from the transformed plant by performing crosses. In another method, marker genes integrated into a transposon are used for the transformation together with the desired nucleic acid (known as the Ac/Ds technology). The transformants can be crossed with a transposase source or the transformants are transformed with a nucleic acid construct conferring expression of a transposase, transiently or stably. In some cases (approx. 10%), the transposon jumps out of the genome of the host cell once transformation has taken place successfully and is lost. In a further number of cases, the transposon jumps to a different location. In these cases, the marker gene must be eliminated by performing crosses. In microbiology, techniques were developed which make possible, or facilitate, the detection of such events. A further advantageous method relies on what are known as recombination systems, whose advantage is that elimination by crossing can be dispensed with.
The best-known system of this type is what is known as the Cre/lox system. Cre1 is a recombinase, which removes the sequences located between the loxP sequences. If the marker gene is integrated between the loxP sequences, it is removed, once transformation has taken place successfully, by expression of the recombinase. Further recombination systems are the HIN/HIX, FLP/FRT and REP/STB system (Tribble et al., J. Biol. Chem., 275, 2000: 22255-22267; Velmurugan et al., J. Cell Biol., 149, 2000: 553-566). A site-specific integration into the plant genome of the nucleic acid sequences according to the invention is possible. Naturally, these methods can also be applied to microorganisms such as yeast, fungi or bacteria.
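
The outcome of recombinase-mediated marker excision can be illustrated in silico with a minimal sketch. It assumes two loxP sites in the same orientation on a single strand, so that the intervening sequence and one of the two sites are removed and a single site remains. The sequence used for `LOXP` is the commonly cited 34 bp loxP site and is included here as an assumption for illustration; the function name `excise_floxed` is hypothetical.

```python
# Commonly cited 34 bp loxP site (assumption): 13 bp repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def excise_floxed(seq: str, site: str = LOXP) -> str:
    """Remove the sequence between the first two direct-repeat sites, leaving one site behind.

    Returns the input unchanged if fewer than two sites are found.
    """
    first = seq.find(site)
    if first == -1:
        return seq
    second = seq.find(site, first + len(site))
    if second == -1:
        return seq
    # Keep everything up to and including the first site, then resume after the second site.
    return seq[: first + len(site)] + seq[second + len(site):]
```

In this model, a marker gene placed between the two sites is lost after excision, while sequences outside the sites, such as the expression cassette, are retained together with the single remaining site.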

The present invention also encompasses plants obtainable by the methods according to the present invention. The present invention therefore provides plants obtainable by the method according to the present invention, which plants have introduced therein a Ste20-like nucleic acid or variant thereof.

The invention also provides a method for the production of transgenic plants having increased yield, comprising introduction and expression in a plant of a Ste20-like nucleic acid or a variant thereof as defined above.

For the purposes of the invention, “transgenic”, “transgene” or “recombinant” means with regard to, for example, a nucleic acid sequence, an expression cassette (=gene construct) or a vector comprising the nucleic acid sequence or an organism transformed with the nucleic acid sequences, expression cassettes or vectors according to the invention, all those constructions brought about by recombinant methods in which either

-   a) the nucleic acid sequences according to the invention, or
-   b) genetic control sequences which are operably linked with the nucleic acid sequence according to the invention, for example a promoter, or
-   c) a) and b)

are not located in their natural genetic environment or have been modified by recombinant methods, it being possible for the modification to take the form of, for example, a substitution, addition, deletion, inversion or insertion of one or more nucleotide residues. The natural genetic environment is understood as meaning the natural genomic or chromosomal locus in the original plant or the presence in a genomic library. In the case of a genomic library, the natural genetic environment of the nucleic acid sequence is preferably retained, at least in part. The environment flanks the nucleic acid sequence at least on one side and has a sequence length of at least 50 bp, preferably at least 500 bp, especially preferably at least 1000 bp, most preferably at least 5000 bp. A naturally occurring expression cassette (for example the naturally occurring combination of the natural promoter of the nucleic acid sequences with the corresponding nucleic acid sequence encoding a polypeptide having kinase domains or a homologue of such polypeptide) becomes a transgenic expression cassette when this expression cassette is modified by non-natural, synthetic (“artificial”) methods such as, for example, mutagenic treatment. Suitable methods are described, for example, in U.S. Pat. No. 5,565,350 or WO 00/15815.

A transgenic plant for the purposes of the invention is therefore understood as meaning, as above, that the nucleic acids used in the method of the invention are not at their natural locus in the genome of said plant, it being possible for the nucleic acids to be expressed homologously or heterologously. However, as mentioned, transgenic also means that, while the nucleic acids according to the invention or used in the inventive method are at their natural position in the genome of a plant, the sequence has been modified with regard to the natural sequence, and/or that the regulatory sequences of the natural sequences have been modified. Transgenic is preferably understood as meaning the expression of the nucleic acids according to the invention at an unnatural locus in the genome, i.e. homologous or, preferably, heterologous expression of the nucleic acids takes place. Preferred transgenic plants are mentioned herein.

Host plants for the nucleic acids or the vector used in the method according to the invention, the expression cassette or construct or vector are, in principle, advantageously all plants which are capable of synthesizing the polypeptides used in the inventive method.

More specifically, the present invention provides a method for the production of transgenic plants having increased yield, which method comprises:

-   (i) introducing and expressing in a plant cell a Ste20-like nucleic acid or variant thereof; and
-   (ii) cultivating the plant cell under conditions promoting plant growth and development.

The nucleic acid may be introduced directly into a plant cell or into the plant itself (including introduction into a tissue, organ or any other part of a plant). According to a preferred feature of the present invention, the nucleic acid is preferably introduced into a plant by transformation.

The term “introduction” or “transformation” as referred to herein encompasses the transfer of an exogenous polynucleotide into a host cell, irrespective of the method used for transfer. Plant tissue capable of subsequent clonal propagation, whether by organogenesis or embryogenesis, may be transformed with a genetic construct of the present invention and a whole plant regenerated therefrom. The particular tissue chosen will vary depending on the clonal propagation systems available for, and best suited to, the particular species being transformed. Exemplary tissue targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus tissue, existing meristematic tissue (e.g., apical meristem, axillary buds, and root meristems), and induced meristem tissue (e.g., cotyledon meristem and hypocotyl meristem). The polynucleotide may be transiently or stably introduced into a host cell and may be maintained non-integrated, for example, as a plasmid. Alternatively, it may be integrated into the host genome. The resulting transformed plant cell may then be used to regenerate a transformed plant in a manner known to persons skilled in the art.

The transfer of foreign genes into the genome of a plant is called transformation. In doing this the methods described for the transformation and regeneration of plants from plant tissues or plant cells are utilized for transient or stable transformation. An advantageous transformation method is the transformation in planta. To this end, it is possible, for example, to allow the agrobacteria to act on plant seeds or to inoculate the plant meristem with agrobacteria. It has proved particularly expedient in accordance with the invention to allow a suspension of transformed agrobacteria to act on the intact plant or at least the flower primordia. The plant is subsequently grown on until the seeds of the treated plant are obtained (Clough and Bent, Plant J. (1998) 16, 735-743). To select transformed plants, the plant material obtained in the transformation is, as a rule, subjected to selective conditions so that transformed plants can be distinguished from untransformed plants. For example, the seeds obtained in the above-described manner can be planted and, after an initial growing period, subjected to a suitable selection by spraying. A further possibility consists in growing the seeds, if appropriate after sterilization, on agar plates using a suitable selection agent so that only the transformed seeds can grow into plants. Further advantageous transformation methods, in particular for plants, are known to the skilled worker and are described herein below.

Transformation of plant species is now a fairly routine technique. Advantageously, any of several transformation methods may be used to introduce the gene of interest into a suitable ancestor cell. Transformation methods include the use of liposomes, electroporation, chemicals that increase free DNA uptake, injection of the DNA directly into the plant, particle gun bombardment, transformation using viruses or pollen and microprojection. Methods may be selected from the calcium/polyethylene glycol method for protoplasts (Krens, F. A. et al., (1982) Nature 296, 72-74; Negrutiu I et al. (1987) Plant Mol Biol 8: 363-373); electroporation of protoplasts (Shillito R. D. et al. (1985) Bio/Technol 3, 1099-1102); microinjection into plant material (Crossway A et al., (1986) Mol. Gen Genet 202: 179-185); DNA or RNA-coated particle bombardment (Klein T M et al., (1987) Nature 327: 70); infection with (non-integrative) viruses and the like. Transgenic rice plants expressing a Ste20-like nucleic acid/gene are preferably produced via Agrobacterium-mediated transformation using any of the well known methods for rice transformation, such as described in any of the following: published European patent application EP 1198985 A1, Aldemita and Hodges (Planta 199: 612-617, 1996); Chan et al. (Plant Mol Biol 22 (3): 491-506, 1993), Hiei et al. (Plant J 6 (2): 271-282, 1994), which disclosures are incorporated by reference herein as if fully set forth. In the case of corn transformation, the preferred method is as described in either Ishida et al. (Nat. Biotechnol 14(6): 745-50, 1996) or Frame et al. (Plant Physiol 129(1): 13-22, 2002), which disclosures are incorporated by reference herein as if fully set forth. Said methods are further described by way of example in B. Jenes et al., Techniques for Gene Transfer, in: Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press (1993) 128-143 and in Potrykus Annu. Rev. Plant Physiol. Plant Molec. Biol. 42 (1991) 205-225.
The nucleic acids or the construct to be expressed are preferably cloned into a vector which is suitable for transforming Agrobacterium tumefaciens, for example pBin19 (Bevan et al., Nucl. Acids Res. 12 (1984) 8711). Agrobacteria transformed by such a vector can then be used in known manner for the transformation of plants, in particular of crop plants such as, by way of example, tobacco plants, for example by bathing bruised leaves or chopped leaves in an agrobacterial solution and then culturing them in suitable media. The transformation of plants by means of Agrobacterium tumefaciens is described, for example, by Höfgen and Willmitzer in Nucl. Acid Res. (1988) 16, 9877 or is known inter alia from F. F. White, Vectors for Gene Transfer in Higher Plants; in Transgenic Plants, Vol. 1, Engineering and Utilization, eds. S. D. Kung and R. Wu, Academic Press, 1993, pp. 15-38.

Generally after transformation, plant cells or cell groupings are selected for the presence of one or more markers which are encoded by plant-expressible genes co-transferred with the gene of interest, following which the transformed material is regenerated into a whole plant.

As mentioned, Agrobacteria transformed with an expression vector according to the invention may also be used in the manner known per se for the transformation of plants such as experimental plants like Arabidopsis or crop plants, such as, for example, cereals, maize, oats, rye, barley, wheat, soya, rice, cotton, sugarbeet, canola, sunflower, flax, hemp, potato, tobacco, tomato, carrot, bell peppers, oilseed rape, tapioca, cassava, arrow root, tagetes, alfalfa, lettuce and the various tree, nut, and grapevine species, in particular oil-containing crop plants such as soya, peanut, castor-oil plant, sunflower, maize, cotton, flax, oilseed rape, coconut, oil palm, safflower (Carthamus tinctorius) or cocoa beans, for example by bathing scarified leaves or leaf segments in an agrobacterial solution and subsequently growing them in suitable media.

In addition to the transformation of somatic cells, which then have to be regenerated into intact plants, it is also possible to transform the cells of plant meristems and in particular those cells which develop into gametes. In this case, the transformed gametes follow the natural plant development, giving rise to transgenic plants. Thus, for example, seeds of Arabidopsis are treated with agrobacteria and seeds are obtained from the developing plants of which a certain proportion is transformed and thus transgenic [Feldman, KA and Marks MD (1987). Mol Gen Genet 208:274-289; Feldmann K (1992). In: C Koncz, N-H Chua and J Schell, eds, Methods in Arabidopsis Research. World Scientific, Singapore, pp. 274-289]. Alternative methods are based on the repeated removal of the inflorescences and incubation of the excision site in the center of the rosette with transformed agrobacteria, whereby transformed seeds can likewise be obtained at a later point in time (Chang (1994). Plant J. 5: 551-558; Katavic (1994). Mol Gen Genet, 245: 363-370). However, an especially effective method is the vacuum infiltration method with its modifications such as the “floral dip” method. In the case of vacuum infiltration of Arabidopsis, intact plants under reduced pressure are treated with an agrobacterial suspension [Bechtold, N (1993). C R Acad Sci Paris Life Sci, 316: 1194-1199], while in the case of the “floral dip” method the developing floral tissue is incubated briefly with a surfactant-treated agrobacterial suspension [Clough, SJ and Bent, AF (1998). The Plant J. 16, 735-743]. A certain proportion of transgenic seeds are harvested in both cases, and these seeds can be distinguished from nontransgenic seeds by growing under the above-described selective conditions. In addition, the stable transformation of plastids is advantageous because plastids are inherited maternally in most crops, reducing or eliminating the risk of transgene flow through pollen.
The transformation of the chloroplast genome is generally achieved by a process which has been schematically displayed in Klaus et al., 2004 [Nature Biotechnology 22 (2), 225-229]. Briefly, the sequences to be transformed are cloned together with a selectable marker gene between flanking sequences homologous to the chloroplast genome. These homologous flanking sequences direct site-specific integration into the plastome. Plastidal transformation has been described for many different plant species and an overview can be taken from Bock (2001) Transgenic plastids in basic research and plant biotechnology. J Mol Biol. 2001 Sep. 21; 312 (3):425-38 or Maliga, P (2003) Progress towards commercialization of plastid transformation technology. Trends Biotechnol. 21, 20-28. Further biotechnological progress has recently been reported in the form of marker-free plastid transformants, which can be produced by a transient cointegrated marker gene (Klaus et al., 2004, Nature Biotechnology 22(2), 225-229).

The genetically modified plant cells can be regenerated via all methods with which the skilled worker is familiar. Suitable methods can be found in the above-mentioned publications by S. D. Kung and R. Wu, Potrykus or Höfgen and Willmitzer.

Following DNA transfer and regeneration, putatively transformed plants may be evaluated, for instance using Southern analysis, for the presence of the gene of interest, copy number and/or genomic organisation. Alternatively or additionally, expression levels of the newly introduced DNA may be monitored using Northern and/or Western analysis, or quantitative PCR, all techniques being well known to persons having ordinary skill in the art.

The generated transformed plants may be propagated by a variety of means, such as by clonal propagation or classical breeding techniques. For example, a first generation (or T1) transformed plant may be selfed and homozygous second-generation (or T2) transformants selected, and the T2 plants may then further be propagated through classical breeding techniques.

The generated transformed organisms may take a variety of forms. For example, they may be chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells transformed to contain the expression cassette); grafts of transformed and untransformed tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion).

The present invention clearly extends to any plant cell or plant produced by any of the methods described herein, and to all plant parts and propagules thereof. The present invention extends further to encompass the progeny of a primary transformed or transfected cell, tissue, organ or whole plant that has been produced by any of the aforementioned methods, the only requirement being that progeny exhibit the same genotypic and/or phenotypic characteristic(s) as those produced by the parent in the methods according to the invention. The invention also includes host cells containing an isolated Ste20-like nucleic acid or variant thereof. Preferred host cells according to the invention are plant cells. The invention also extends to harvestable parts of a plant such as, but not limited to, seeds, leaves, fruits, flowers, stems, rhizomes, tubers and bulbs. The invention furthermore relates to products directly derived from a harvestable part of such a plant, such as dry pellets or powders, oil, fat and fatty acids, starch or proteins.

The present invention also encompasses use of Ste20-like nucleic acids or variants thereof and use of Ste20-like polypeptides or homologues thereof.

One such use relates to improving the growth characteristics of plants, in particular in improving yield, especially seed yield. The seed yield may include one or more of the following: increased total weight of seeds, increased number of filled seeds and increased Harvest Index.

Ste20-like nucleic acids or variants thereof, or Ste20-like polypeptides or homologues thereof may find use in breeding programmes in which a DNA marker is identified which may be genetically linked to a Ste20-like gene or variant thereof. The Ste20-like nucleic acids/genes or variants thereof, or Ste20-like polypeptides or homologues thereof may be used to define a molecular marker. This DNA or protein marker may then be used in breeding programmes to select plants having increased yield. The Ste20-like gene or variant thereof may, for example, be a nucleic acid as represented by any one of SEQ ID NO: 1, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31 and SEQ ID NO: 33.

Allelic variants of a Ste20-like nucleic acid/gene may also find use in marker-assisted breeding programmes. Such breeding programmes sometimes require introduction of allelic variation by mutagenic treatment of the plants, using for example EMS mutagenesis; alternatively, the programme may start with a collection of allelic variants of so called “natural” origin caused unintentionally. Identification of allelic variants then takes place, for example, by PCR. This is followed by a step for selection of superior allelic variants of the sequence in question and which give increased yield. Selection is typically carried out by monitoring growth performance of plants containing different allelic variants of the sequence in question, for example, different allelic variants of any one of SEQ ID NO: 1, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31 and SEQ ID NO: 33. Growth performance may be monitored in a greenhouse or in the field. Further optional steps include crossing plants, in which the superior allelic variant was identified, with another plant. This could be used, for example, to make a combination of interesting phenotypic features.

Ste20-like nucleic acids or variants thereof may also be used as probes for genetically and physically mapping the genes that they are a part of, and as markers for traits linked to those genes. Such information may be useful in plant breeding in order to develop lines with desired phenotypes. Such use of Ste20-like nucleic acids or variants thereof requires only a nucleic acid sequence of at least 15 nucleotides in length. The Ste20-like nucleic acids or variants thereof may be used as restriction fragment length polymorphism (RFLP) markers. Southern blots (Sambrook J, Fritsch E F and Maniatis T (1989) Molecular Cloning, A Laboratory Manual) of restriction-digested plant genomic DNA may be probed with the Ste20-like nucleic acids or variants thereof. The resulting banding patterns may then be subjected to genetic analyses using computer programs such as MapMaker (Lander et al. (1987) Genomics 1: 174-181) in order to construct a genetic map. In addition, the nucleic acids may be used to probe Southern blots containing restriction endonuclease-treated genomic DNAs of a set of individuals representing parent and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted and used to calculate the position of the Ste20-like nucleic acid or variant thereof in the genetic map previously obtained using this population (Botstein et al. (1980) Am. J. Hum. Genet. 32:314-331).

The production and use of plant gene-derived probes for use in genetic mapping is described in Bernatzky and Tanksley (1986) Plant Mol. Biol. Reporter 4: 37-41. Numerous publications describe genetic mapping of specific cDNA clones using the methodology outlined above or variations thereof. For example, F2 intercross populations, backcross populations, randomly mated populations, near isogenic lines, and other sets of individuals may be used for mapping. Such methodologies are well known to those skilled in the art.

The nucleic acid probes may also be used for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel et al. In: Non-mammalian Genomic Analysis: A Practical Guide, Academic Press 1996, pp. 319-346, and references cited therein).

In another embodiment, the nucleic acid probes may be used in direct fluorescence in situ hybridization (FISH) mapping (Trask (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favour use of large clones (several kb to several hundred kb; see Laan et al. (1995) Genome Res. 5:13-20), improvements in sensitivity may allow performance of FISH mapping using shorter probes.

A variety of nucleic acid amplification-based methods for genetic and physical mapping may be carried out using the nucleic acids. Examples include allele-specific amplification (Kazazian (1989) J. Lab. Clin. Med 11:95-96), polymorphism of PCR-amplified fragments (CAPS; Sheffield et al. (1993) Genomics 16:325-332), allele-specific ligation (Landegren et al. (1988) Science 241:1077-1080), nucleotide extension reactions (Sokolov (1990) Nucleic Acid Res. 18:3671), Radiation Hybrid Mapping (Walter et al. (1997) Nat. Genet. 7:22-28) and Happy Mapping (Dear and Cook (1989) Nucleic Acid Res. 17:6795-6807). For these methods, the sequence of a nucleic acid is used to design and produce primer pairs for use in the amplification reaction or in primer extension reactions. The design of such primers is well known to those skilled in the art. In methods employing PCR-based genetic mapping, it may be necessary to identify DNA sequence differences between the parents of the mapping cross in the region corresponding to the instant nucleic acid sequence. This, however, is generally not necessary for mapping methods.

The methods according to the present invention result in plants having increased yield, as described hereinbefore. These advantageous growth characteristics may also be combined with other economically advantageous traits, such as further yield-enhancing traits, tolerance to various stresses, traits modifying various architectural features and/or biochemical and/or physiological features.

DESCRIPTION OF FIGURES

The present invention will now be described with reference to the following figures in which:

FIG. 1 shows the typical domain structure of Ste20-like polypeptides. The N-terminal end of the protein comprises a Ser/Thr kinase domain. The most C-terminal domain (in light grey) has a coiled coil structure, which is usually but not always present.

FIG. 2 shows a binary vector p070, for expression in Oryza sativa of an Arabidopsis thaliana Ste20-like coding sequence under the control of a GOS2 promoter (internal reference PRO0129).

FIG. 3 details examples of sequences useful in performing the methods according to the present invention.

EXAMPLES

The present invention will now be described with reference to the following examples, which are by way of illustration alone. Unless otherwise stated, recombinant DNA techniques are performed according to standard protocols described in Sambrook (2001) Molecular Cloning: a laboratory manual, 3rd Edition, Cold Spring Harbor Laboratory Press, CSH, New York, or in Volumes 1 and 2 of Ausubel et al. (1994), Current Protocols in Molecular Biology, Current Protocols. Standard materials and methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by R. D. D. Croy, published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific Publications (UK).

Example 1 Identification of Homologues of the Ste20-Like Protein of SEQ ID NO: 2 and Determination of their Similarity/Identity

Sequences (full length cDNA, ESTs or genomic) related to the nucleic acid sequence used in the methods of the present invention were identified amongst those maintained in the Entrez Nucleotides database at the National Center for Biotechnology Information using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402). This program is typically used to find regions of local similarity between sequences by comparing nucleic acid or polypeptide sequences to sequence databases and by calculating the statistical significance of matches. The polypeptide encoded by the nucleic acid of the present invention was used with the TBLASTN algorithm, with default settings, and the filter for ignoring low complexity sequences was switched off. The output of the analysis was viewed by pairwise comparison, and ranked according to the probability score (E-value), where the score reflects the probability that a particular alignment occurs by chance (the lower the E-value, the more significant the hit). In addition to E-values, comparisons were also scored by percentage identity. Percentage identity refers to the number of identical nucleotides (or amino acids) between the two compared nucleic acid (or polypeptide) sequences over a particular length. In some instances, the default parameters may be adjusted to modify the stringency of the search.

Rice sequences and EST sequences from various plant species may also be obtained from other databases, such as KOME (Knowledge-based Oryza Molecular biological Encyclopedia; Kikuchi et al., Science 301, 376-379, 2003), Sputnik (Rudd, S., Nucleic Acids Res., 33: D622-D627, 2005) or the Eukaryotic Gene Orthologs database (EGO, hosted by The Institute for Genomic Research). These databases are searchable with the BLAST tool. SEQ ID NO: 11 to SEQ ID NO: 34 are nucleic acid and protein sequences of homologues of SEQ ID NO: 2 and were obtained from the above-mentioned databases using SEQ ID NO: 2 as a query sequence.

Percentages of similarity and identity between the full-length sequences and the sequences of the kinase domains of Ste20-like proteins were determined using MatGAT (Matrix Global Alignment Tool) software (Campanella J J, Bitincka L, Smalley J (2003) MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. BMC Bioinformatics 4:29; software hosted by Ledion Bitincka). MatGAT software generates similarity/identity matrices for DNA or protein sequences without needing pre-alignment of the data. The program performs a series of pair-wise alignments using the Myers and Miller global alignment algorithm (with a gap opening penalty of 11, and a gap extension penalty of 1), calculates similarity and identity using for example Blosum 62 (for polypeptides), and then places the results in a distance matrix. Sequence similarity is shown in the bottom half, below the diagonal dividing line, and sequence identity is shown in the top half, above the diagonal dividing line. The sequence of SEQ ID NO: 2 is indicated as number 1 in the matrix.
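By way of illustration only, the idea of deriving a percentage identity from a pairwise global alignment can be sketched as follows. This is not the MatGAT implementation: it assumes a plain Needleman-Wunsch alignment with a linear gap penalty and a simple match/mismatch score instead of the Myers and Miller algorithm with Blosum 62, and it assumes that identity is expressed relative to the alignment length. The function names and scoring values are hypothetical.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty (simplified)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a, aln_b):
    """Identical aligned positions as a percentage of the alignment length (assumed denominator)."""
    identical = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    return 100.0 * identical / len(aln_a)


# Two ten-residue fragments taken from the kinase domains of SEQ ID NO: 2 and SEQ ID NO: 12.
frag_a, frag_b = "HRDIKAGNIL", "HRDVKAGNIL"
identity = percent_identity(*global_align(frag_a, frag_b))
```

Applied to every pair of sequences, such a function would fill the upper half of an identity matrix of the kind described above; a similarity percentage would additionally require a substitution matrix to decide which non-identical residues count as similar.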

The kinase domains of the Ste20-like proteins were delineated using the SMART tool and the obtained sequences are listed in Table 3.

TABLE 3 List of the kinase domains in the various SEQ ID NOs:

SEQ ID NO: 2   YEIICKIGVGVSASVYKAICIPMNSMVVAIKAIDLDQSRADFDSLRRETKTMSLLSHPNILNAYCSFTVDRCLWVVMPFMSCGSLHSIVSSSFPSGLPENCISVFLKETLNAISYLHDQGHLHRDIKAGNILVDSDGSVKLADFGVSASIYEPVTSSSGTTSSSLRLTDIAGTPYWMAPEVVHSHTGYGFKADIWSFGITALELAHGRPPLSHLPPLKSLLMKITKRFHFSDYEINTSGSSKKGNKKFSKAFREMVGLCLEQDPTKRPSAEKLLKHPFF
SEQ ID NO: 12  YKLYEEIGDGVSATVHRALCIPLNVVVAIKVLDLEKCNNDLDGIRREVQTMSLINHPNVLQAHCSFTTGHQLWVVMPYMAGGSCLHIIKSSYPDGFEEPVIATLLRETLKALVYLHAHGHIHRDVKAGNILLDSNGAVKLADFGVSACMFDTGDRQRSRNTFVGTPCWMAPEVMQQLHGYDFKADVWSFGITALELAHGHAPFSKYPPMKVLLMTLQNAPPGLDYERDKRFSKAFKEMVGTCLVKDPKKRPTSEKLLKHPFF
SEQ ID NO: 14  YELFEEVGEGVSATVYRARCIALNEIVAVKILDLEKCRNDLETIRKEVHIMSLIDHPNLLKAHCSFIDSSSLWIVMPYMSGGSCFHLMKSVYPEGLEQPIIATLLREVLKALVYLHRQGHIHRDVKAGNILIHSKGVVKLGDFGVSACMFDSGERMQTRNTFVGTPCWMAPEVMQQLDGYDFKYLAHGHAPFSKYPPMKVLLMTLQNAPPRLDYDRDKKFSKSFRELIAACLVKDPKKRPTAAKLLKHPFF
SEQ ID NO: 16  YEILEEIGDGVYRARCILLDEIVAIKIWNLEKCTNDLETIRKEVHRLSLIDHPNLLRVHCSFIDSSSLWIVMPFMSCGSSLNIMKSVYPNGLEEPVIAILLREILKALVYLHGLGHIHRNVKAGNVLVDSEGTVKLGDFEVSASMFDSVERMRTSSENTFVGNPRRMAPEKDMQQVDGYDFKVDIWSFGMTALELAHGHSPTTVLPLNLQNSPFPNYEEDTKFSKSFRELVAACLIEDPEKRPTASQLLEYPFL
SEQ ID NO: 18  YTLYEFIGQGVSALVHRALCIPFDEVVAIKILDFERDNCDLNNISREAQTMMLVDHPNVLKSHCSFVSDHNLWVIMPYMSGGSCLHILKAAYPDGFEEAIIATILREALKGLDYLHQHGHIHRDVKAGNILLGARGAVKLGDFGVSACLFDSGDRQRTRNTFVGTPCWMAPEVMEQLHGYDFKADIWSFGITGLELAHGHAPFSKYPPMKVLLMTLQNAPPGLDYERDKKFSRSFKQMIASCLVKDPSKRPSAKKLLKHSFF
SEQ ID NO: 20  YKLMEEVGYGASAVVHRAIYLPTNEVVAIKSLDLDRCNSNLDDIRREAQTMTLIDHPNVIKSFCSFAVDHHLWVVMPFMAQGSCLHLMKAAYPDGFEEAAICSMLKETLKALDYLHRQGHIHRDVKAGNILLDDTGEIKLGDFGVSACLFDNGDRQRARNTFVGTPCWMAPEVLQPGSGYNSKADIWSFGITALELAHGHAPFSKYPPMKVLLMTIQNAPPGLDYDRDKKFSKSFKELVALCLVKDQTKRPTAEKLLKHSFF
SEQ ID NO: 22  YKLMEEIGHGASAVVYRAIYLPTNEVVAIKCLDLDRCNSNLDDIRRESQTMSLIDHPNVIKSFCSFSVDHSLWVVMPFMAQGSCLHLMKTAYSDGFEESAICCVLKETLKALDYLHRQGHIHRDVKAGNILLDDNGEIKLGDFGVSACLFDNGDRQRARNTFVGTPCWMAPEVLQPGNGYNSKADIWSFGITALELAHGHAPFSKYPPMKVLLMTIQNAPPGLDYDRDKKFSKSFKEMVAMCLVKDQTKRPTAEKLLKHSCF
SEQ ID NO: 24  YRLLCKIGSGVSAVVYKAACVPLGSAVVAIKAIDLERSRANLDEVWREAKAMALLSHRNVLRAHCSFTVGSHLWVVMPFMAAGSLHSILSHGFPDGLPEQCIAVVLRDTLRALCYLHEQGRIHRDIKAGNILVDSDGSVKLADFGVSASIYETAPSTSSAFSGPINHAPPPSGAALSSSCFNDMAGTPYWMAPEVIHSHVGYGIKADIWSFGITALELAHGRPPLSHLPPSKSMLMRITSRVRLEVDASSSSSEGSSSAARKKKKFSKAFKDMVSSCLCQEPAKRPSAEKLLRHPFF
SEQ ID NO: 26  YKLCEEVGDGVSATVYKALCIPLNIEVAIKVLDLEKCSNDLDGIRREVQTMSLIDHPNLLRAYCSFTNGHQLWVIMPYMAAGSALHIMKTSFPDGFEEPVIATLLREVLKALVYLHSQGHIHRDVKAGNILIDTNGAVKLGDFGVSACMFDTGNRQRARNTFVGTPCWMAPEVMQQLHGYDYKADIWSFGITALELAHGHAPFSKYPPMKVLLMTLQNAPPGLDYERDKRFSKSFKDLVATCLVKDPRKRPSSEKLLKHSFF
SEQ ID NO: 28  YELYEEIGQGVSAIVYRSLCKPLDEIVAVKVLDFERTNSDLWLVVMQVGYTRIVAIYVPPLDLSKMIVTRICLTQNNIMREAQTMILIDQPNVMKAHCSFTNNHSLWVVMPYMAGGSCLHIMKSVYPDGFEEAVIATVLREVLKGLEYLHHHGHIHRDVKAGNILVDSRGVVKLGDFGVSACLFDSGDRQRARNTFVGTPCWMAPEVMEQLHGYDFKADIWSFGITALELAHGHAPFSKFPPMKVLLMTLQNAPPGLDYERDKKFSRHFKQMVAMCLVKDPSKRPTAKKLLKQPFF
SEQ ID NO: 30  YQLMEEVGYGAHAVVYRALFVPRNDVVAVKCLDLDQLNNNIDEIQREAQIMSLIEHPNVIRAYCSFVVEHSLWVVMPFMTEGSCLHLMKIAYPDGFEEPVIGSILKETLKALEYLHRQGQIHRDVKAGNILVDNAGIVKLGDFGVSACMFDRGDRQRSRNTFVGTPCWMAPEVLQPGTGYNFKADIWSFGITALELAHGHAPFSKYPPMKVLLMTLQNAPPGLDYDRDRRFSKSFKEMVAMCLVKDQTKRPTAEKLLKHSFF
SEQ ID NO: 32  YRLLEEVGYGANAVVYRAVFLPSNRTVAVKCLDLDRVNSNLDDIRKEAQTMSLIDHPNVIRAYCSFVVDHNLWVIMPFMSEGSCLHLMKVAYPDGFEEPVIASILKETLKALEYLHRQGHIHRDVKRNIIQAGNILMDSPGIVKLGDFGVSACMFDRGDRQRSRNTFVGTPCWMAPEVLQPGAGYNFKKYVSNHLFTNLIWLFKISLRGKNSNYHKNTGNKVLLMTLQNAPPGLDYDRDKRFSKSFKEMVAMCLVKDQTKRPTAEKLLKHSFF
SEQ ID NO: 34  YKIVDEIGAGNSAVVYKAICIPINSTPVAIKSIDLDRSRPDLDDVRREAKTLSLLSHPNILKAHCSFTVDNRLWVVMPFMAGGSLQSIISHSFQNGLTEQSIAVILKDTLNALSYLHGQGHLHRDIKSGNILVDSNGLVKLADFGVSASIYESNNSVGACSSYSSSSSNSSSSHIFTDFAGTPYWMAPEVIHSHNGYSFKADIWSFGITALELAHGRPPLSHLPPSKSLMLNITKRFKFSDFDKHSYKGHGGSNKFSKAFKDMVALCLNQDPTKRPSAEKLLKHSFF

Results of the MatGAT analysis are shown in Table 4 for the full-length sequences and in Table 5 for the kinase domains of the Ste20-like polypeptides. Percentage identity is given above the diagonal (in bold) and percentage similarity is given below the diagonal (normal font). Percentage identity between kinase domains of Ste20-like paralogues and orthologues of SEQ ID NO: 2 ranges between 44% (for SEQ ID NO: 28) and 71% (for SEQ ID NO: 34).

TABLE 4 Sequence similarity\identity for the full-length sequences

                    1     2     3     4     5     6     7     8     9    10    11    12    13
 1. SEQIDNO2        -  30.7  30.2  32.6  29.3  28.8  28.9  46.5  29.0  26.8  35.5  28.9  53.5
 2. SEQIDNO12    45.8     -  45.3  33.9  45.2  42.8  43.1  28.6  55.1  42.7  41.5  38.3  32.3
 3. SEQIDNO14    48.3  62.3     -  41.2  39.6  37.9  38.7  28.1  44.3  37.2  40.3  35.7  30.1
 4. SEQIDNO16    55.2  48.1  56.4     -  28.9  27.4  27.8  30.7  33.7  29.2  35.8  30.1  32.8
 5. SEQIDNO18    43.9  63.4  56.8  43.7     -  41.5  40.5  27.1  42.3  48.9  37.9  36.4  28.7
 6. SEQIDNO20    42.1  61.2  53.8  41.9  61.4     -  73.9  26.8  41.2  40.2  45.2  47.7  29.8
 7. SEQIDNO22    42.3  60.8  53.2  41.9  60.8  84.9     -  25.7  41.2  39.7  44.1  48.5  29.2
 8. SEQIDNO24    59.9  43.9  48.1  47.8  41.6  42.7  41.9     -  28.3  26.3  32.2  26.9  47.1
 9. SEQIDNO26    44.2  72.4  58.2  47.5  62.8  61.0  60.4  44.3     -  41.1  39.8  35.9  29.2
10. SEQIDNO28    39.3  57.7  51.5  41.7  64.9  58.6  56.4  39.3  58.6     -  36.8  34.2  25.9
11. SEQIDNO30    51.6  54.5  58.1  52.5  52.2  57.5  56.4  50.8  52.4  49.4     -  48.0  33.0
12. SEQIDNO32    42.8  54.9  51.8  42.9  53.9  64.1  63.0  43.9  54.5  51.1  60.9     -  29.3
13. SEQIDNO34    69.3  48.2  50.5  53.9  44.8  44.6  45.3  63.5  45.0  41.5  53.4  48.2     -

TABLE 5 Sequence similarity\identity for the kinase domain sequences of Table 3

                    1     2     3     4     5     6     7     8     9    10    11    12    13
 1. SEQIDNO2        -  54.5  46.6  46.5  48.7  50.9  51.3  66.0  52.3  44.1  51.3  43.4  70.8
 2. SEQIDNO12    69.9     -  72.5  60.5  77.5  74.0  74.4  50.2  86.6  70.3  74.4  64.5  52.3
 3. SEQIDNO14    62.4  84.7     -  65.8  67.6  64.9  64.5  41.9  72.1  61.8  65.3  60.1  43.2
 4. SEQIDNO16    65.2  77.9  82.3     -  54.5  52.3  52.6  40.9  59.8  51.3  53.8  49.1  44.8
 5. SEQIDNO18    65.2  89.3  81.7  73.3     -  73.7  72.1  45.8  74.4  73.3  69.1  61.6  47.7
 6. SEQIDNO20    67.4  86.3  78.2  72.9  87.0     -  93.1  47.0  73.3  63.9  81.3  72.5  51.9
 7. SEQIDNO22    67.4  86.3  77.9  72.9  85.9  97.3     -  47.0  73.3  64.5  81.3  72.1  52.6
 8. SEQIDNO24    77.1  64.6  59.3  58.9  60.9  62.3  62.0     -  49.2  43.2  47.0  39.9  64.4
 9. SEQIDNO26    68.5  96.2  84.0  78.6  87.0  86.6  87.0  64.0     -  67.2  73.3  64.1  51.6
10. SEQIDNO28    62.8  79.1  73.0  66.2  81.8  76.7  76.7  61.3  78.0     -  63.2  55.5  43.0
11. SEQIDNO30    66.3  85.5  80.2  74.0  86.3  93.5  92.7  62.0  85.9  75.7     -  77.2  49.1
12. SEQIDNO32    62.0  78.0  72.9  66.7  78.8  85.0  83.9  57.9  78.8  71.6  87.2     -  44.4
13. SEQIDNO34    80.5  69.0  61.3  64.5  65.9  69.3  69.3  76.4  68.6  64.9  67.9  66.2     -

Example 2 Gene Cloning of Ste20-Like

The Arabidopsis thaliana Ste20-like gene was amplified by PCR using as template an Arabidopsis thaliana seedling cDNA library (Invitrogen, Paisley, UK). After reverse transcription of RNA extracted from seedlings, the cDNAs were cloned into pCMV Sport 6.0. Average insert size of the bank was 1.5 kb and the original number of clones was of the order of 1.59×10⁷ cfu. Original titer was determined to be 9.6×10⁵ cfu/ml after first amplification of 6×10¹¹ cfu/ml. After plasmid extraction, 200 ng of template was used in a 50 μl PCR mix. Primers prm03186 (SEQ ID NO: 3; sense; start codon in bold, AttB1 site in italic: 5′-ggggacaagtttgtacaaaaaagcaggcttca caatggctcggaacaagctc 3′) and prm03187 (SEQ ID NO: 4; reverse, complementary, AttB2 site in italic: 5′ ggggaccactttgtacaagaaagctgggtaatagttaacccaaaacactatcttta 3′), which include the AttB sites for Gateway recombination, were used for PCR amplification. PCR was performed using Hifi Taq DNA polymerase in standard conditions. A PCR fragment of 1532 bp (including attB sites) was amplified and purified also using standard methods. The first step of the Gateway procedure, the BP reaction, was then performed, during which the PCR fragment recombines in vivo with the pDONR201 plasmid to produce, according to the Gateway terminology, an “entry clone”, p068. Plasmid pDONR201 was purchased from Invitrogen, as part of the Gateway® technology.
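The layout of the two primers can be checked with a short sketch such as the one below. It assumes that the recombination site of each primer is the leading 29 nucleotides (the four guanines plus the attB core, as they appear at the 5′ end of the primers quoted above); the exact extent of the italicised region is not recoverable from the text, so this boundary, like the helper names, is an assumption made for illustration.

```python
# Leading 29 nt of each primer, assumed here to be the attB1/attB2 adapter part.
ATTB1 = "ggggacaagtttgtacaaaaaagcaggct"
ATTB2 = "ggggaccactttgtacaagaaagctgggt"


def split_primer(primer, att_site):
    """Return (att_site, remainder) after removing whitespace; fail if the site is absent."""
    seq = "".join(primer.split()).lower()
    if not seq.startswith(att_site):
        raise ValueError("primer does not start with the expected attB site")
    return att_site, seq[len(att_site):]


def reverse_complement(seq):
    """Reverse complement of a lower-case DNA string."""
    comp = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(comp[base] for base in reversed(seq))


# Primer sequences as quoted in Example 2.
prm03186 = "ggggacaagtttgtacaaaaaagcaggcttca caatggctcggaacaagctc"
prm03187 = "ggggaccactttgtacaagaaagctgggtaatagttaacccaaaacactatcttta"

_, fwd_tail = split_primer(prm03186, ATTB1)
_, rev_tail = split_primer(prm03187, ATTB2)

# Position of the start codon within the gene-specific part of the sense primer.
start = fwd_tail.find("atg")
coding_start = fwd_tail[start:]

# The reverse primer anneals to the sense strand; its reverse complement reads 5'->3' on that strand.
rev_tail_on_sense_strand = reverse_complement(rev_tail)
```

Here `coding_start` holds the first codons of the amplified coding sequence, and `rev_tail_on_sense_strand` gives the sense-strand sequence to which the reverse primer corresponds.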

Example 3 Vector Construction

The entry clone p068 was subsequently used in an LR reaction with p00640, a destination vector used for Oryza sativa transformation. This vector contains as functional elements within the T-DNA borders: a plant selectable marker; a screenable marker expression cassette; and a Gateway cassette intended for LR in vivo recombination with the sequence of interest already cloned in the entry clone. A rice GOS2 promoter (nucleotides 1 to 2193 of SEQ ID NO: 5) for constitutive expression (PRO0129) was located upstream of this Gateway cassette.

After the LR recombination step, the resulting expression vector, p070 for Ste20-like (FIG. 2), was transformed into Agrobacterium strain LBA4044 and subsequently into Oryza sativa plants. Transformed rice plants were allowed to grow and were then examined for the parameters described in Example 4.

Example 4 Evaluation and Results of Ste20-Like Under the Control of the Rice GOS2 Promoter

Approximately 15 to 20 independent T0 rice transformants were generated. The primary transformants were transferred from a tissue culture chamber to a greenhouse for growing and harvest of T1 seed. Five events, of which the T1 progeny segregated 3:1 for presence/absence of the transgene, were retained. For each of these events, approximately 10 T1 seedlings containing the transgene (hetero- and homo-zygotes) and approximately 10 T1 seedlings lacking the transgene (nullizygotes) were selected by monitoring visual marker expression. The selected T1 plants were transferred to a greenhouse. Each plant received a unique barcode label to link unambiguously the phenotyping data to the corresponding plant. The selected T1 plants were grown on soil in 10 cm diameter pots under the following environmental settings: photoperiod=11.5 h, daylight intensity=30,000 lux or more, daytime temperature=28° C., night time temperature=22° C., relative humidity=60-70%. Transgenic plants and the corresponding nullizygotes were grown side-by-side at random positions. From the stage of sowing until the stage of maturity the plants were passed several times through a digital imaging cabinet. At each time point digital images (2048×1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles.
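The text does not state how the 3:1 segregation of the T1 progeny was assessed. One common way to check whether observed counts are compatible with a 3:1 ratio is a chi-square goodness-of-fit test with one degree of freedom; the following is a minimal sketch under that assumption, with a hypothetical function name and made-up counts used only to exercise the code.

```python
import math


def chi_square_3_to_1(n_with, n_without):
    """Chi-square goodness-of-fit statistic and p-value for a 3:1 expectation (1 d.f.)."""
    total = n_with + n_without
    expected_with = 0.75 * total
    expected_without = 0.25 * total
    chi2 = ((n_with - expected_with) ** 2 / expected_with
            + (n_without - expected_without) ** 2 / expected_without)
    # For one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: seedlings with / without visual marker expression.
chi2, p_value = chi_square_3_to_1(75, 25)
```

A large p-value indicates that the observed counts do not contradict a 3:1 segregation, as expected for a single transgene locus; a small p-value would suggest a different segregation pattern.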

The mature primary panicles were harvested, bagged, barcode-labelled and then dried for three days in the oven at 37° C. The panicles were then threshed and all the seeds collected. The filled husks were separated from the empty ones using an air-blowing device. After separation, both seed lots were then counted using a commercially available counting machine. The empty husks were discarded. The filled husks were weighed on an analytical balance and the cross-sectional area of the seeds was measured using digital imaging. This procedure resulted in the set of the following seed-related parameters:

The number of filled seeds was determined by counting the number of filled husks that remained after the separation step. The total seed yield was measured by weighing all filled husks harvested from a plant. Total seed number per plant was measured by counting the number of husks harvested from a plant. Thousand Kernel Weight (TKW) is extrapolated from the number of filled seeds counted and their total weight. Harvest index is defined as the ratio between the total seed weight and the above-ground area (mm²), multiplied by a factor 10⁶. These parameters were derived in an automated way from the digital images using image analysis software and were analysed statistically. Individual seed parameters (including width, length, area, weight) were measured using a custom-made device consisting of two main components, a weighing and imaging device, coupled to software for image analysis.
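The derived seed parameters defined above reduce to simple arithmetic, sketched below. The function and argument names are hypothetical, the weight unit (grams) is an assumption since the text does not state it, and the input values in the usage lines are invented solely to exercise the formulas.

```python
def seed_parameters(n_filled, n_total, filled_weight_g, aboveground_area_mm2):
    """Derive per-plant seed parameters from counts, weight and imaged above-ground area."""
    return {
        "number_of_filled_seeds": n_filled,
        "total_seed_number": n_total,
        "total_seed_yield_g": filled_weight_g,
        # TKW: weight of the counted filled seeds extrapolated to 1000 seeds.
        "tkw_g": 1000.0 * filled_weight_g / n_filled,
        # Harvest index: total seed weight / above-ground area (mm^2), times 10^6.
        "harvest_index": filled_weight_g / aboveground_area_mm2 * 1e6,
    }


# Invented example values for a single plant.
params = seed_parameters(n_filled=200, n_total=250,
                         filled_weight_g=5.0, aboveground_area_mm2=50000.0)
```

With these invented inputs the TKW is 25 g and the harvest index is 100.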

A two factor ANOVA (analysis of variance) corrected for the unbalanced design was used as statistical model for the overall evaluation of plant phenotypic characteristics. An F-test was carried out on all the parameters measured of all the plants of all the events transformed with that gene. The F-test was carried out to check for an effect of the gene over all the transformation events and to verify an overall effect of the gene, also named herein “global gene effect”. If the value of the F-test shows that the data are significant, then it is concluded that there is a “gene” effect, meaning that it is not only the presence or the position of the gene that is causing the effect. The threshold for significance for a true global gene effect is set at 5% probability level for the F-test.

To check for an effect of the genes within an event, i.e., for a line-specific effect, a t-test was performed within each event using data sets from the transgenic plants and the corresponding null plants. “Null plants” or “null segregants” or “nullizygotes” are the plants treated in the same way as the transgenic plant, but from which the transgene has segregated. Null plants can also be described as the homozygous negative transformed plants. The threshold for significance for the t-test is set at 10% probability level. The results for some events can be above or below this threshold. This is based on the hypothesis that a gene might only have an effect in certain positions in the genome, and that the occurrence of this position-dependent effect is not uncommon. This kind of gene effect is also named herein a “line effect of the gene”. The p-value is obtained by comparing the t-value to the t-distribution or alternatively, by comparing the F-value to the F-distribution. The p-value then gives the probability that the null hypothesis (i.e., that there is no effect of the transgene) is correct.
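The within-event comparison can be sketched as a two-sample t-test whose t-value is compared to the t-distribution. The text does not say which variant of the t-test was used, so the pooled-variance (Student) form here is an assumption. The function names and measurements are hypothetical, and the tail probability is obtained by simple numerical integration rather than a statistics library.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    s = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = s * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def pooled_t_test(transgenic, null):
    """Student's two-sample t-test with pooled variance (an assumption)."""
    na, nb = len(transgenic), len(null)
    df = na + nb - 2
    sp2 = ((na - 1) * variance(transgenic) + (nb - 1) * variance(null)) / df
    t = (mean(transgenic) - mean(null)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, df, two_sided_p(t, df)


# Hypothetical measurements for one event (e.g. total seed weight per plant).
transgenic = [10.0, 12.0, 11.0, 13.0, 14.0]
null = [8.0, 9.0, 7.0, 10.0, 6.0]
t, df, p = pooled_t_test(transgenic, null)
# 10% threshold for a line effect, as stated in the text.
print(f"t = {t:.3f}, df = {df}, p = {p:.4f}, line effect: {p < 0.10}")
```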

The data obtained for Ste20 in the first experiment were confirmed in a second experiment with T2 plants. Four lines that had the correct expression pattern were selected for further analysis. Seed batches from the positive plants (both hetero- and homozygotes) in T1 were screened by monitoring marker expression. For each chosen event, the heterozygote seed batches were then retained for T2 evaluation. Within each seed batch an equal number of positive and negative plants were grown in the greenhouse for evaluation.

A total of 120 Ste20 transformed plants were evaluated in the T2 generation, that is, 30 plants per event, of which 15 were positive for the transgene and 15 were negative.

Because two experiments with overlapping events had been carried out, a combined analysis was performed. This is useful to check consistency of the effects over the two experiments, and if this is the case, to accumulate evidence from both experiments in order to increase confidence in the conclusion. The method used was a mixed-model approach that takes into account the multilevel structure of the data (i.e. experiment-event-segregants). P-values are obtained by comparing the likelihood ratio test statistic to chi-square distributions.

Example 5 Evaluation of Ste20 Transformants: Measurement of Yield-Related Parameters

Upon analysis of the seeds as described above, the inventors found that plants transformed with the Ste20 gene construct had a higher seed yield, expressed as number of filled seeds, total weight of seeds and harvest index, compared to plants lacking the Ste20 transgene.

The results obtained for plants in the T1 generation are summarised in Table 6:

TABLE 6

                      % difference   p-value
Nr filled seeds       +38            0.0003
Total weight seeds    +38            0.0004
Harvest Index         +42            0.0001

These positive results were again obtained in the T2 generation. In Table 7, data show the overall % increases for the number of filled seeds, total weight of seeds and harvest index, calculated from the data of the individual lines of the T2 generation, and the respective p-values. These T2 data were re-evaluated in a combined analysis with the results for the T1 generation, and the obtained p-values show that the observed effects were highly significant.

TABLE 7

                      T2 generation              Combined analysis
                      % difference   p-value     p-value
Nr filled seeds       +30            0.0004      0.0000
Total weight seeds    +29            0.0008      0.0000
Harvest Index         +33            0.0001      0.0000

1-13. (canceled)
14. A construct comprising: (i) a Ste20-like nucleic acid or variant thereof; (ii) one or more control sequences capable of driving expression of the nucleic acid sequence of (i); and optionally (iii) a transcription termination sequence.
15. The construct of claim 14, wherein said control sequence is a constitutive promoter.
16. The construct of claim 15, wherein said constitutive promoter is a GOS2 promoter.
17. The construct of claim 16, wherein said GOS2 promoter comprises nucleotides 1 to 2193 of SEQ ID NO: 5.
18. A plant transformed with the construct of claim 14.
19-27. (canceled)
28. A method of producing a transgenic plant, plant, or part thereof, comprising introducing the construct of claim 14 into a plant cell, plant, or part thereof.